Editor’s Comments

In  this  issue  of  the  Journal  of  the  Washington  Academy  of  Sciences 
we  are  celebrating  the  Washington,  D.C.,  region  and  its  science  and 
technology  presence! 

Back  in  1985,  Amitai  Etzioni’s  Washington  Post  editorial,  “The 
World-Class  University  that  Our  City  Has  Become,”  was  his  personal 
statement  as  a new  resident  of  the  Washington,  D.C.,  area  in  the  mid- 
1980s.  It  provided  an  interesting  view  of  the  city’s  aspirations  in  science 
and  technology  and  policy  circles  at  that  time.  Stuart  Umpleby 
rediscovered  this  editorial  and  provides  an  updated  perspective  in 
“Intellectual Washington Today.” While Etzioni’s emphasis was on the
policy community (he called it “Washington Metropolitan University,”
or W.M.U.), Umpleby’s emphasis is on the more recent growth of
information-related activities in the Washington, D.C., area. Together,
these  dual  perspectives  highlight  the  important  role  of  the  science  and 
technology  community  and  academic  and  policy  institutions  in  the  affairs 
of  the  Washington,  D.C.,  metropolitan  region. 

In  line  with  this  celebration  of  the  Washington  area’s  science 
presence,  it  is  fitting  that  this  issue  documents  the  Academy’s  annual 
Awards  Program  and  ceremony.  Sethanne  Howard  presented  the  keynote 
at  the  banquet  — about  the  scientist,  Benjamin  Banneker,  who  lived  in  the 
Baltimore  area  from  1731  to  1806.  The  geographical  boundary  for 
Washington,  D.C.,  was  surveyed  back  in  the  late  1700s  using  the  eclipses 
of  the  Galilean  satellites  to  determine  longitude.  As  part  of  the  survey 
team,  Banneker  timed  the  eclipses  of  the  Galilean  satellites  by  Jupiter  and 
he  kept  the  survey  clocks  running  at  the  right  time.  Thus,  the  title  of  the 
presentation  at  the  awards  ceremony  was  “Benjamin  Banneker  and 
Celestial  Navigation:  Just  How  Did  They  Know  Where  They  Were, 
Then?” Banneker’s story was quite interesting at the banquet ... and is
now equally interesting in this issue of the Journal!

As  an  introduction  to  the  Academy’s  2015  Awards  Program,  we 
provide  a “backgrounder”  on  the  Awards  Program.  It  includes  an  early 
history  of  the  program,  some  traditions,  and  a primer  on  the  nominations 
process. 

Congratulations  to  these  distinguished  scientists  and  educators  in 
Washington,  D.C.,  area  scientific  institutions  whom  the  Academy  honored 
with  awards  in  their  fields  this  year:  Ronald  Colie,  Ram  D.  Sriram, 
Marcus  Cicerone,  Paul  Peterson,  Robert  Gover,  Gregory  Strouse,  and 
MaryBeth Petrasek. Details on each of their awards are presented along
with  photos  from  the  ceremony  and  banquet. 

A quantitative  study  of  the  policy  implications  of  broadband 
improvements  across  the  country  is  presented  in  the  paper  by  Paul 
Lapointe  entitled  “Does  Speed  Matter?  The  Employment  Impact  of 
Increasing  Access  to  Fiber  Internet.”  The  study  finds  a positive  association 
between  increasing  access  to  fiber  cable  and  increases  in  employment  and 
the  number  of  firms  at  the  county  level  which,  in  turn,  offers  evidence  that 
promoting  access  to  fiber  internet  is  a viable  approach  to  economic 
development. 

This  Journal  issue  also  includes  an  Addendum  to  the  Academy’s 
2014  Membership  Directory  which  appeared  in  the  Winter  2014  issue  of 
the  Journal  of  the  Washington  Academy  of  Sciences.  The  names  of  twenty 
new  members  who  were  inadvertently  omitted  from  the  2014  Membership 
Directory  are  printed  in  this  issue  instead  of  waiting  for  the  next  Directory. 
My  sincere  apologies  for  their  omission  from  our  annual  Membership 
Directory  this  past  year. 

In  addition,  in  this  Journal  issue  we  honor  the  life  of  Burton 
Hurdle,  a long-time  member  of  the  Washington  science  community  who 
passed  away  this  Spring. 

Finally,  this  Journal  issue  marks  my  last  issue  as  editor.  I’ve  edited 
the  Journal  for  three  years,  and  at  this  time  I am  handing  over  the 
editorship  to  Sethanne  Howard.  Please  send  Sethanne  your  manuscripts 
and other input going forward. I’ve enjoyed working with all the authors,
reviewers,  proofreaders,  and  members  of  the  Board  of  Discipline  Editors 
who  have  contributed  their  time  so  that  we  can  maintain  high  standards  for 
the  Journal.  I’ve  been  blessed  by  the  large  number  of  talented  people 
interested  in  supporting  this  unique  peer-reviewed  interdisciplinary 
Journal.  It’s  been  an  honor  working  with  all  of  you  ...  too  many  to 
mention  individually  over  that  period  of  time  ...  please  know  that  I 
appreciate  and  thank  each  of  you  from  my  heart. 

Sally  A.  Rood,  PhD,  Outgoing  Editor 
Journal  of  the  Washington  Academy  of  Sciences 


Washington  Academy  of  Sciences 




Guest  Editorial 

Intellectual  Washington  Today 

Stuart  A.  Umpleby 

The  George  Washington  University,  Washington,  D.C. 

Abstract 

In  a Washington  Post  editorial  thirty  years 
ago,  Amitai  Etzioni  described  how 
Washington,  D.C.  was  becoming  an 
intellectual  city.  Previously,  Washington 
was  viewed  as  the  home  of  the  national 
government,  journalism,  lawyers,  and 
lobbyists  but  not  as  an  academic  or 
intellectual  city.  However,  Etzioni  claimed 
that  Washington  had  become  a policy  and 
scientific  powerhouse  as  a result  not  only 
of  its  growing  and  improving  universities 
and  their  research  institutes,  but  also 
because  of  its  federal  agencies,  think  tanks,  and  policy  research 
organizations.  This  article  reviews  the  points  made  by  Etzioni  and 
examines  the  situation  today. 


Introduction 

Washington,  D.C.,  is  a city  with  many  ironic  descriptions.  It  is  often 
described  as  the  Northern-most  Southern  city.  John  F.  Kennedy  said  it  was 
a city  of  Northern  charm  and  Southern  efficiency.  It  has  been  called  a city 
full  of  former  student  body  presidents,  and  a city  consisting  of  residents 
who  come  from  somewhere  else.  Currently  Washington  may  be  known  as 
a city  of  politicians,  interns,  diplomats,  and  bloggers.  It  is  not  often 
thought  of  as  a scientific  city  or  an  intellectual  city.  However,  Washington 
has  been  growing  and  changing.  As  the  nation  becomes  increasingly  a 
post-industrial  society,  Washington  is  becoming  a leader  in  new  types  of 
organizations  and  new  kinds  of  jobs. 

A Description  of  Washington  30  Years  Ago 

To  explain  the  purpose  of  an  editorial  he  contributed  to  the 
Washington  Post  in  the  Spring  of  1985,  Amitai  Etzioni  said  that  people 
sometimes  asked  him  why  he  had  moved  from  Columbia  University  to 
Washington,  D.C.,  which  previously  had  not  had  a reputation  as  a source 
of innovative ideas. He wrote that the Washington Metropolitan
University  — the  combination  of  universities,  policy  research  institutes, 
and  government  agencies  — “easily  matches  the  intellectual  vigor  of 
contemporary  London,”  and  that  it  had  “almost  as  many  little  magazines 
(where  intellectuals  float  new  ideas)  and  writers-in-residence  as  the  Left 
Bank  of  Paris.” 

Etzioni  pointed  out  that  several  new  research  organizations  had 
been  added  to  the  D.C.  area  prior  to  1985:  the  Roosevelt  Center,  the 
Center  for  National  Policy,  and  the  Cato  Institute. 

He  also  noted  that  the  National  Institutes  of  Health  (NIH)  did  more 
research  in  biology  and  related  disciplines  than  was  conducted  at  Harvard, 
Yale,  Princeton,  Columbia  and  Brown  combined. 

Major  research  centers  in  economics  could  be  found  in  the  World 
Bank,  the  Federal  Reserve  Board,  and  the  Congressional  Budget  Office. 

The  natural  sciences  were  strong  in  the  Carnegie  Institution  of 
Washington,  the  Smithsonian  Institution,  the  National  Institute  of 
Standards  and  Technology  in  Gaithersburg,  Maryland,  and  the  Department 
of  Defense  (DOD). 

Etzioni  made  a distinction  between  academics  who  were  deep 
scholars  of  narrow  topics  and  intellectuals  who  took  a broader  perspective 
on  the  direction  of  American  society  and  trends  in  literature  and  the  arts. 
He  claimed  that  many  intellectuals  had  moved  to  D.C.  because  they  found 
the  academic  abundance  congenial. 

Etzioni  also  noted  that  academics  and  intellectuals  communicated 
with  each  other  not  only  in  seminars,  but  also  in  magazines  that  stimulated 
new  ideas.  As  just  a few  of  these  published  in  D.C.,  he  listed: 

• The Wilson Quarterly,

• the American Enterprise Institute’s Public Opinion,

• Regulation, which reports on the effects of government
intervention,

• The Cato Journal, and

• Foreign Policy magazine, then a new competitor to Foreign
Affairs, published in New York.

Etzioni further noted that Science magazine was the nation’s leading
journal of science, and that the National Academy of Sciences published
Issues in Science and Technology (both products of D.C.). Finally, he
noted that Washington provides numerous television news and discussion
programs to the nation.

As  an  academic  and  intellectual  city,  how  has  Washington 
progressed  since  1985? 

Many  Universities  in  Washington 

There  has  been  continued  growth  and  improvement  in  universities, 
particularly  the  growth  of  George  Mason  University  since  it  became 
independent  in  1972. 

The  familiar  Washington,  D.C.,  universities  — American 
University,  Catholic  University,  Georgetown  University,  George 
Washington  University,  Howard  University,  Johns  Hopkins  University’s 
School  of  Advanced  International  Studies,  the  University  of  Maryland, 
Marymount  University,  and  the  University  of  the  District  of  Columbia  — 
are  all  prospering. 

Several  well-established  universities,  for  example  George 
Washington  University  and  the  University  of  Maryland,  now  have 
buildings in several parts of the city. These locations make classes more
convenient for students and also host research, as does George
Washington University’s Virginia Science and Technology Campus in
nearby Ashburn, Virginia.

Universities  based  in  other  cities  also  have  a presence  in  the 
Washington  area.  For  example,  Cornell  University,  New  York  University, 
Syracuse  University,  Pepperdine  University,  and  Virginia  Tech  are  all 
here. 

Clearly  universities  find  it  desirable  to  have  a connection  to 
Washington,  D.C. 

The  Growth  of  Information-Based  Activities 

The  information  intensive  activities  of  the  federal  government  have 
also  expanded  greatly  since  30  years  ago.  A few  examples  of  such 
activities  in  the  Washington  area  are  the  following: 

• The National Security Agency at Fort Meade, Maryland, has
become the center of a “cyber valley” in the Baltimore-Washington
corridor. [Schiff, 2013]

• The Dulles access toll road in northern Virginia is home to the
expanded “beltway” contracting firms and information services
firms such as AOL.

• The  Route  270  corridor  in  Maryland  just  north  of  D.C.  continues  to 
be  the  home  for  biological  research,  with  key  institutions  being 
NIH,  the  Walter  Reed  Army  Medical  Center,  and  Bethesda  Naval 
Medical  Center. 

• Research  programs,  administered  at  NASA  Headquarters  and  the 
Goddard  Space  Flight  Center  in  nearby  Greenbelt,  Maryland,  have 
made  fundamental  contributions  to  improving  weather  forecasting, 
to  earth  science,  and  to  our  understanding  of  climate  change. 
NASA’s  Hubble  Space  Telescope  has  dramatically  advanced  our 
understanding  of  the  cosmos. 

• The number of patents issued by the U.S. Patent and Trademark
Office (USPTO), headquartered in Alexandria, Virginia, has nearly
tripled in the past twenty years, from 113,268 in 1994 to
329,613 in 2014. [USPTO, 2015] The USPTO now has not only a
new building but a new campus in Alexandria, just south of D.C.

Washington  is  definitely  a leader  in  applications  of  information 
technology.  The  Internet,  an  outgrowth  of  an  earlier  DOD  research 
project,  has  transformed  business,  government  and  personal 
communication  in  recent  years.  The  D.C.  area’s  knowledge  workers  now 
spend  hours  each  day  in  “cyberspace”  and  the  contents  of  filing  cabinets 
are  now  “in  the  cloud”  with  both  positive  and  negative  consequences. 
Cybersecurity  is  a leading  domestic  and  international  concern  and 
“identity  theft”  is  a new  worry  for  private  citizens. 

The Washington Post is now owned by Jeff Bezos, the founder of Amazon.com. Many
newspapers  have  gone  out  of  business.  There  are  now  numerous  blogs 
written  by  former  journalists. 

Improving  Management  in  Government  and  Business 

In  the  years  since  Etzioni  wrote  his  article,  there  have  been  many 
changes  in  the  federal  government  which  have  transformed  both  the 
practice  of  government  in  Washington  and  also  influenced  the 
management  of  corporations  and  state  and  local  governments. 

In  1987,  Congress  created  the  Malcolm  Baldrige  National  Quality 
Improvement  Program  aimed  at  improving  the  productivity  of  U.S.  firms, 
which in the 1970s were having difficulty competing with Japanese
manufacturers. The Baldrige National Quality Award was expanded to
include education and health care organizations in 1999, and a government
and non-profit category was added in 2007.

As an example of Washington’s growing influence, a 1991 General
Accounting Office report showed that companies in the Baldrige Program
increased their market share by an annual average of 13.7 percent. [Garvin,
1991] Such a high growth rate meant that, within just a few years,
companies using quality improvement methods bought or replaced companies
not using these methods. A more recent study reported an 820:1 ratio of
benefits for the U.S. economy to program costs. [Link and Scott, 2012] To
arrive at this ratio, the authors compared the benefits received by the 273
Malcolm Baldrige National Quality Award applicants from 2007 to 2010 with
the cost of operating the Baldrige Program. The 820-to-1 ratio counts only
the benefits for the surveyed applicants, but it counts all of the Baldrige
Program’s costs. Link and Scott note that the benefit-to-cost ratio would
be much higher if program costs were compared with benefits for the entire
U.S. economy.
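
The claim that a 13.7 percent annual gain in market share adds up quickly can be checked with a short compounding calculation. The sketch below is illustrative only: it assumes the growth rate is constant and compounds from year to year, and the 10 percent starting share is a hypothetical figure, not one taken from the report.

```python
import math

ANNUAL_GROWTH = 0.137  # 13.7 percent average annual market-share growth, from the text


def share_after(initial_share, annual_growth, years):
    """Market share after compounding a constant annual growth rate."""
    return initial_share * (1.0 + annual_growth) ** years


def doubling_time(annual_growth):
    """Years needed for a quantity to double at a constant annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_growth)


# Hypothetical firm that starts with a 10 percent market share.
for year in range(0, 6):
    print(f"year {year}: {share_after(10.0, ANNUAL_GROWTH, year):.1f}% share")
print(f"doubling time: {doubling_time(ANNUAL_GROWTH):.1f} years")
```

Under these assumptions a firm's share roughly doubles in a little over five years, which is consistent with the "just a few years" time frame described above.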

Quality improvement methods were taken seriously by President
Bill Clinton, who appointed Vice President Al Gore to head the National
Performance  Review  in  1993.  This  initiative  had  the  goal  of  dramatically 
improving  the  performance  of  the  federal  government  through  a 
combination  of  process  improvement  methods  and  increased  contracting 
as  an  alternative  to  larger  government  agencies. 

In  March  1998,  the  National  Performance  Review  pointed  to  a 
number  of  important  achievements,  later  presented  in  Kamensky  [1999]: 

• The size of the federal civilian workforce was cut by 351,000, making
it the smallest since President Kennedy held office and, as a
percentage of the national workforce, the smallest since 1931.

• Action  was  recommended  on  about  1,500  issues  in  1993  and  1995. 
Agencies  completed  about  58  percent.  Of  the  original 
recommendations,  66  percent  were  reported  as  completed.  For 
those  requiring  Presidential  or  Congressional  action,  President 
Clinton  signed  46  directives  and  Congress  passed  and  the  President 
signed  over  85  laws. 

• About  $177  billion  in  savings  were  recommended  over  a 5-year 
period.  Agencies  locked  into  place  about  $137  billion.  In  addition, 
as  of  March  1998,  the  process  improvement  award  winners 
estimated savings or cost avoidances of about $31 billion because
of their actions.


• Agencies  eliminated  about  640,000  pages  of  internal  rules,  about 
16,000  pages  of  Federal  Regulations,  and  rewrote  31,000 
additional  pages  into  plain  language. 

• Agencies  sponsored  850  labor-management  partnerships.  A 1998 
survey  of  employees  showed  those  in  organizations  that  actively 
promoted  reinvention  were  twice  as  satisfied  with  their  jobs. 

• Over  570  federal  organizations  had  committed  to  more  than  4,000 
customer  service  standards. 

Kamensky  [1999]  also  reported  that  public  trust  in  the  federal  government 
was  increasing  after  a 30-year  decline.  While  it  was  not  clear  whether  this 
improvement  was  directly  linked  to  the  work  of  the  National  Performance 
Review,  the  Review  made  an  important  contribution. 

The  Use  of  Information  in  Policy-Making 

Who  analyzes  information  and  writes  reports  in  the  D.C.  area  has 
also  been  changing.  Since  Etzioni  wrote  his  article,  the  Office  of 
Technology  Assessment  was  closed  and  the  number  of  Congressional  staff 
was  significantly  reduced  during  the  time  that  Newt  Gingrich  was  Speaker 
of  the  U.S.  House  of  Representatives.  As  a result,  it  can  be  said  that  the 
task  of  providing  background  information  for  legislation  has  been  taken  up 
by  lobbying  firms  on  K Street  which  often  draft  new  legislation,  a task 
previously  performed  by  Congressional  staff  members.  [Benen,  2011] 

Also,  political  activity  has  moved  from  public  demonstrations  on 
the Mall to campaign contributions and lobbying behind closed doors. It is
harder  now,  in  2015,  to  know  what  changes  in  laws  are  occurring.  Hedrick 
Smith  [2012]  in  his  recent  book  notes  that  when  he  was  head  of  the 
Washington  bureau  of  The  New  York  Times,  he  did  not  realize  that  a series 
of  laws  and  court  decisions  were  fundamentally  changing  taxes  and 
entitlement  programs  beginning  in  the  late  1970s.  Over  time,  these 
changes  have  led  to  a dramatic  increase  in  inequality  in  the  United  States, 
which  has  affected  all  U.S.  citizens. 

Conclusion 

In  his  article  30  years  ago,  Amitai  Etzioni  focused  primarily  on  the 
many  policy  research  organizations  in  Washington.  During  his  years  as  a 
professor at the George Washington University, Etzioni himself has made
notable contributions to policy research and discussions. He created the
Society  for  the  Advancement  of  Socio-Economics  and  an  academic 
journal,  the  Journal  for  the  Advancement  of  Socio-Economics.  He  founded 
and  leads  the  Communitarian  Network,  a non-profit,  non-partisan 
organization  dedicated  to  supporting  the  moral,  social  and  political 
foundations  of  society.  He  is  currently  Director  of  the  Institute  for 
Communitarian  Policy  Studies  at  George  Washington  University. 

Of  course  not  all  of  the  information-related  activities  in  the 
Washington  area  involve  transformative  policy  analyses.  Much  of  the 
work  — for  example,  at  the  Patent  Office  and  the  National  Security 
Agency  — requires  careful  attention  to  detail.  In  the  past  30  years,  the 
number  of  information-related  jobs  in  the  Washington,  D.C.,  area  has 
increased  dramatically. 

However,  large  organizations  that  conduct  these  information 
processing  activities  create  a demand  for  educated  workers  and,  just  as 
importantly,  for  additional  innovations  in  handling  information-related 
tasks.  For  this  reason,  several  local  universities  have  recently  started 
degree  programs  in  big  data,  data  analytics,  and  cyber  security. 

The  “post-industrial”  society  which  has  exploded  in  the  D.C.  area 
in  the  past  30  years  has  also  been  growing  globally.  Around  the  world, 
new  universities  are  being  established  and  are  improving.  In  any  event, 
Washington,  D.C.,  is  well-positioned  to  be  a leading  city  in  this  post- 
industrial era. 

Overall,  the  city-wide  university  that  Etzioni  described  30  years 
ago  is  a key  player  in  defining  and  creating  the  nation  and  the  world  in  the 
21st  Century. 

Does Speed Matter? The Employment Impact of Increasing Access to Fiber Internet

Paul Lapointe

Abstract

As  internet  technology  continues  to  improve  at  a rapid  pace,  there  is 
constant  debate  over  the  relative  value  of  various  internet  connection 
technologies. In recent years, policymakers have debated several
important questions regarding broadband. What speed qualifies as
high-speed broadband? How much public funding should be spent increasing
access to broadband? And what regulations should be imposed on internet
providers? While the potential and proven benefits of high-speed
internet  are  diverse,  the  economic  impacts  are  often  at  the  forefront  of 
policy  discussions.  To  date,  most  research  into  the  economic  impact  of 
internet  has  focused  on  increasing  access  to  and  adoption  of  broadband 
internet  in  general,  without  emphasizing  the  speed  of  the  broadband 
connections.  This  paper  utilizes  new  data  available  as  a result  of  the 
American  Recovery  and  Reinvestment  Act  to  examine  the  relationship 
between  employment  growth  and  access  to  fiber  internet,  currently  seen 
as  the  gold  standard  of  internet  connections  in  terms  of  speed  and 
reliability.  Using  data  from  the  National  Broadband  Map,  this  study 
finds  a positive  association,  within  the  United  States,  between 
increasing  access  to  fiber  and  increases  in  employment  and  number  of 
firms  at  the  county  level.  This  positive  relationship  provides  evidence 
to  policymakers  that  promoting  access  to  fiber  internet  is  a viable 
economic  development  approach. 

Introduction 

Although  there  is  a strong  consensus  that  high-speed  internet  is 
related  to  economic  growth,  many  questions  remain  about  what  speed  is 
optimal.  As  the  internet  becomes  ubiquitous  in  the  United  States,  attention 
has  shifted  from  expanding  access  to  the  internet  towards  improving  the 
connections  that  Americans  have  access  to.  Table  1 shows  the  percent  of 
U.S.  households  that  have  access  to  different  types  of  internet 
technologies.  Almost  all  households  have  access  to  some  form  of  internet 
connection,  whether  it  is  a fixed  line  connection,  wireless  internet,  or 
satellite.  Additionally,  95  percent  of  households  have  access  to  fixed  line 
internet,  including  87  percent  that  have  access  to  a cable  internet 
connection. The opportunity that remains is in expanding access to
state-of-the-art technologies such as optical fiber, where access has been
expanding in recent years but still remains out of reach for most American
households, as only one in four households has access to it.


Table 1. Percent of U.S. Households with Access to Internet Technologies in June 2011 and December 2013

                           June 2011   December 2013   Change
Any internet                  99%           99%          0%
Any fixed line internet       95%           95%          0%
Cable internet                83%           87%          5%
Optical fiber internet        17%           24%          7%

Source: National Broadband Map
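
The Change column in Table 1 reads as a difference in percentage points, not a relative change. The sketch below recomputes both measures from the rounded shares printed in the table. The function names are illustrative. Because the inputs are rounded, a recomputed difference can differ by a point from the printed one (cable comes out as 4 points here, not 5). This is presumably because the published change was calculated from unrounded shares, but that is an assumption.

```python
# Rounded household access shares (percent) as printed in Table 1:
# (June 2011, December 2013).
access = {
    "Any internet": (99, 99),
    "Any fixed line internet": (95, 95),
    "Cable internet": (83, 87),
    "Optical fiber internet": (17, 24),
}


def point_change(start, end):
    """Change in percentage points between two shares."""
    return end - start


def relative_change(start, end):
    """Change expressed as a percent of the starting share."""
    return 100.0 * (end - start) / start


changes = {
    name: (point_change(start, end), relative_change(start, end))
    for name, (start, end) in access.items()
}

for name, (points, relative) in changes.items():
    print(f"{name:26s} {points:+d} points  {relative:+.1f}% relative")
```

Measured in relative terms, fiber access grew far faster than the 7-point difference suggests, because it started from a much lower base.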

Over  the  past  two  decades,  the  policy  focus  has  been  on 
increasing  broadband  access  and  adoption.  In  the  aggregate,  these  efforts 
have  largely  been  successful.  Broadband  access  (using  the  Organisation 
for  Economic  Co-operation  and  Development  (OECD)  definition  of  256 
Kbit/sec)  in  the  United  States  has  increased  from  4.4  percent  of 
households  having  access  in  2000  to  19.9  percent  of  households  in  2003 
and  68.2  percent  in  2010  (OECD,  2014).  Now  that  broadband  access  is 
wider,  many  policymakers  have  shifted  away  from  increasing  access 
towards  increasing  speed.  The  demand  for  high  speed  is  clear;  when 
Google  announced  plans  to  pilot  its  Google  Fiber  networks,  more  than 
1,100  communities  across  the  country  applied  (Kelly,  2010).  Absent 
private investment, some municipalities have dedicated vast taxpayer
resources  to  construct  fiber  networks  of  their  own.  Clearly,  effort  is  being 
put  into  improving  internet  connections,  yet  there  is  little  empirical 
evidence  as  to  whether  these  ultra-high-speed  networks  promote  growth 
beyond  the  benefits  of  more  common  speeds.  The  purpose  of  this  paper  is 
to  examine  the  economic  impact  of  fiber  internet  availability  in  the  United 
States. 

Now, thanks to recent enhancements to the Federal
Communications Commission’s (FCC) data collection strategies,
researchers have access to data that allow the impact of fiber networks to
be examined for the first time. Evaluating the economic impact of fiber
internet can give policymakers information to help them determine the
amount of resources to invest in the technology.

Literature Review

There  is  a growing  body  of  literature  on  the  economic  impact  of 
high  speed  internet  both  in  the  United  States  and  around  the  world;  the 
general  consensus  is  that  high  speed  internet  leads  to  economic  growth 
(Qiang, 2009; Van Gaasbeck, 2008; Whitacre et al., 2013; Kolko, 2012).
The  literature  can  be  divided  into  research  on  differences  in  broadband 
technology  across  countries  and  differences  in  broadband  technology 
within  a single  country.  While  this  paper  will  focus  on  the  effect  of 
broadband  differences  in  the  United  States,  it  is  important  to  examine  the 
literature  in  both  areas  to  build  a cohesive  picture  of  the  state  of  research 
on  economic  effects  of  broadband. 

International  Literature 

As  a whole,  the  literature  on  country-level  effects  of  broadband 
technology  shows  that  countries  with  higher  levels  of  broadband 
penetration have generally higher levels of GDP growth. Czernich et al.
(2009) used data from a panel of 25 OECD countries between 1996 and
2007 to create a model, which used pre-existing telephone and TV networks
to predict maximum broadband penetration rates, to examine economic
impact. They found a statistically significant positive relationship:
a 10 percent increase in broadband penetration raised GDP per capita by
0.9 to 1.5 percent.

Similarly,  Qiang  and  Rossotto  (2009)  used  Information 
Communications  and  Technologies  Development  (ICTD)  and  World  Bank 
data  for  120  countries  between  1980  and  2006  to  understand  how  differing 
broadband  penetration  rates  are  related  to  GDP  per  capita  growth.  They 
estimated  that  a 10  percent  increase  in  broadband  adoption  is  associated 
with  a 1.21  percent  increase  in  GDP  per  capita  for  developing  countries 
and  a 1.38  percent  increase  for  developed  countries.  However,  Qiang  and 
Rossotto  (2009)  caution  that  causality  is  not  abundantly  clear;  that  is,  there 
could  be  a back  and  forth  effect  as  increased  wealth  also  increases  the 
demand  for  broadband  services.  Koutroumpis  (2009)  attempted  to  account 
for  the  fact  that  broadband  can  both  influence  and  be  influenced  by 
economic  factors  using  a simultaneous  equation  model  to  identify  the 
macro  impact  of  broadband  in  15  European  Union  countries  between  2003 
and  2006.  He  separated  the  increased  demand  for  broadband  caused  by 
increased  wealth  from  the  economic  growth  caused  by  increased 
broadband  usage  with  models  that  predict  the  supply  and  demand  for 
broadband growth. After separating out these influences, there was still a
significant, positive relationship between broadband penetration and GDP
per capita.
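
To compare the cross-country estimates cited above, the sketch below scales each one to a different increase in broadband penetration. The linear scaling is a simplifying assumption made for illustration and is not a claim from any of the studies. The dictionary labels and function name are likewise hypothetical.

```python
# Estimated effect on GDP per capita (percent) of a 10 percent increase in
# broadband penetration or adoption, as reported in the text above.
ESTIMATES_PER_10 = {
    "Czernich et al., lower bound": 0.9,
    "Czernich et al., upper bound": 1.5,
    "Qiang and Rossotto, developing countries": 1.21,
    "Qiang and Rossotto, developed countries": 1.38,
}


def implied_gdp_effect(penetration_increase, effect_per_10):
    """Scale a per-10-percent estimate linearly (an illustrative assumption)."""
    return effect_per_10 * penetration_increase / 10.0


# Implied effects of a hypothetical 5 percent increase in penetration.
effects_for_5 = {
    label: implied_gdp_effect(5.0, estimate)
    for label, estimate in ESTIMATES_PER_10.items()
}

for label, effect in effects_for_5.items():
    print(f"{label}: {effect:.2f} percent")
```

Placed side by side this way, the estimates fall in a fairly narrow band, which is the sense in which the international literature is described as broadly consistent.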


National  Literature 

Research  within  the  United  States  has  been  building  over  time, 
with  researchers  using  a variety  of  datasets  and  approaches  to  building 
models.  While  the  approaches  vary,  there  is  a consensus  that 
improvements  in  broadband  technology  are  related  to  higher  levels  of 
employment,  although  findings  are  mixed  on  other  economic  indicators 
such  as  number  of  firms  and  income. 

The  early  literature  in  the  United  States  focused  on  building  cross- 
sectional  panel  models  that  take  advantage  of  varying  levels  of 
technological  development  across  regions  of  the  country.  Lehr  et  al. 
(2005) used data from the FCC Form 477 and the Population Censuses and
Establishments  Surveys  to  investigate  the  effect  of  broadband  presence  (as 
a binary  measure)  on  economic  indicators  such  as  employment,  wages, 
and  industry  mix.  Their  model,  which  used  data  from  1998-2002,  showed 
that  in  zip  codes  with  mass-market  broadband  availability  there  was  higher 
employment,  more  firms  overall,  and  more  firms  in  the  IT  sector.  The 
broadband speed studied was 200 kilobits per second, a speed that would
now be considered slow. Moreover, the Lehr et al. (2005) study showed
the tradeoffs involved in choosing to study broadband at the state or
community level in the United States. Crandall et al. (2007) built on this
model  with  data  from  2003-2005  to  examine  state  level  GDP  growth 
associated  with  increased  broadband  penetration.  While  they  found  that 
higher  levels  of  broadband  penetration  were  associated  with  higher  levels 
of  GDP  growth,  the  results  were  not  statistically  significant,  which 
reinforces  the  notion  that  state-level  data  are  too  broad  to  study  broadband 
in  America.  While  several  dependent  variables  of  interest,  such  as  GDP, 
are  not  available  at  smaller  geographical  units  than  the  state,  there  is 
generally  not  sufficient  variation  between  states  in  broadband  availability 
to  draw  meaningful  conclusions. 

While  much  of  the  research  in  the  United  States  uses  FCC  data, 
two  studies  in  2007  corroborate  the  larger  national  studies  using  different 
data  sources.  Van  Gaasbeck  (2008)  used  cross-sectional  panel  household 
survey  data  from  Scarborough  Research  to  examine  the  potential 
employment  effects  of  expanding  broadband  adoption  in  California.  They 
found  that  increased  broadband  adoption  was  associated  with  higher 
employment  but  fewer  establishments.  Similarly,  Shideler  et  al.  (2007) 
looked at county-level effects for a single state, Kentucky. They focused
on  increased  broadband  availability,  instead  of  broadband  adoption.  Using 
infrastructure  data  from  providers  collected  through  ConnectKentucky, 
they  examined  county-level  employment  growth  and  sector  employment 
growth  relative  to  broadband  availability,  controlling  for  past  growth, 
education,  unemployment,  and  road  density.  They  found  a positive, 
statistically  significant  relationship  between  broadband  availability  and 
total  employment.  While  the  limited  scope  of  these  studies  restricts  the 
applicability  to  broader  national  policies,  they  help  to  validate  the  general 
association  between  broadband  and  employment. 

In  a qualitative  analysis,  Ezell  et  al.  (2009)  made  the  case  for 
facilitating  the  development  of  internet  with  speeds  of  at  least  20  Mbit/sec 
downloading  and  preferably  50  Mbit/sec  or  greater.  While  most  policy 
efforts  focus  on  increasing  broadband  adoption  and  availability,  the 
authors  encourage  policymakers  to  consider  efforts  to  increase  speeds  as 
well.  They  count  fiber  to  the  home,  fiber  to  the  node,  and  DOCSIS  3.0 
cable  as  the  most  desirable  fixed-line  broadband  delivery  methods  and  4G 
as  the  most  desirable  wireless  delivery  method.  They  point  out  that 
countries  such  as  Japan,  Singapore,  South  Korea,  and  Sweden  are  far 
ahead  of  the  United  States  in  terms  of  high  speed  internet,  giving  them  an 
advantage  in  developing  innovative  web-based  applications.  In  order  for 
the  United  States  to  remain  the  global  leader  in  internet  based  innovation, 
they  contend  that  there  needs  to  be  a greater  focus  on  increasing 
broadband  speed. 

There has been a recent focus on the impact of broadband
expansion  in  rural  communities  in  particular.  While  high  speed  internet 
has  become  standard  in  most  urban  and  suburban  communities,  lower 
population  density  makes  it  much  more  costly  for  providers  to  expand  into 
rural  areas.  Therefore,  many  policy  initiatives  have  focused  on  how  the 
government  can  play  a role  in  expanding  access  in  rural  communities. 
Stenberg et al. (2009) matched rural counties that had broadband by 2000
with those that did not, based on a variety of characteristics, in order to test
for a causal relationship between broadband and economic growth in rural
counties. They aggregated FCC Form 477 data to measure broadband
availability and found faster employment growth in counties with more
availability. There was also evidence that counties that adopted broadband
early experienced relative income growth, but this advantage faded over time
as broadband became more widespread. Whitacre et al. (2013) used data
newly  available  from  the  National  Broadband  Map  combined  with 
adoption  rates  from  FCC  Form  477  to  examine  economic  impacts  of 
broadband expansion into rural communities. They used three different
techniques  to  examine  the  relationship  between  broadband  and  economic 
health.  The  collective  results  indicated  a positive  relationship  between 
rural  economic  indicators  and  broadband  availability.  They  concluded  that 
adoption  thresholds  had  more  of  an  impact  than  availability  thresholds 
(Whitacre et al., 2013).

With regard to the debate over whether to use adoption or
availability as the key indicator of broadband penetration, Kolko (2012)
made the case for availability. He pointed out that adoption rates are more
likely than availability to be influenced by economic growth. Additionally,
increasing  availability  is  a more  feasible  approach  for  policymakers  than 
increasing  adoption  rates.  Using  cross-sectional  panel  data  from  the  FCC 
between  1999  and  2006,  Kolko  built  a model  to  identify  the  impact  of 
availability  on  local  level  employment  and  county-level  labor  market 
outcomes. He found a statistically significant, positive relationship
between broadband expansion and local employment, but cautioned that the
increased employment was accompanied by increased population growth,
resulting in no impact on employment rates.

In  2013,  NC  Broadband  hosted  a research  roundtable  to  discuss  the 
state  of  research  on  the  economic  and  community  impact  of  broadband 
expansion  (Feser  et  al.,  2013).  The  final  report  suggested  that  there  is  a 
need  for  more  research  on  specific  broadband  policies  and  investments  at 
the margin, including increases in broadband speeds and reliability and
use  of  new  technology.  This  paper  will  attempt  to  fill  some  of  that  gap.  It 
benefits  from  the  requirement  in  the  National  Broadband  Plan  that  states 
collect  more  detailed  information  on  different  technologies  and  speeds 
available  at  local  levels.  Using  this  new  dataset,  it  is  now  possible  to  start 
evaluating  whether  or  not  incremental  expansion  of  the  presence  of  fiber 
technology  is  associated  with  increased  economic  growth. 

Study  Hypothesis 

The  central  hypothesis  being  tested  in  this  study  is  that,  within  the 
United  States,  increasing  access  to  fiber  internet  connections  is  related  to 
increased  levels  of  economic  growth,  as  measured  by  employment  levels, 
number  of  firms,  and  income.  Broadband,  in  general,  can  lead  to  economic 
growth  in  several  ways.  By  connecting  individuals  and  companies  across 
the  globe,  the  internet  can  make  it  easier  for  small  and  medium  sized  firms 
to  do  business  with  suppliers  and  customers  that  they  otherwise  would  not 
have  interactions  with.  Further,  individuals  are  able  to  use  the  internet  to 
connect with employers and potentially work remotely for companies
anywhere  in  the  world,  opening  up  more  employment  opportunities  and 
facilitating  virtual  talent  mobility.  Lastly,  we  would  expect  a short-term 
increase  in  employment  due  to  the  fact  that  creating  the  connections 
requires  the  hiring  of  employees  to  dig  up  cables,  install  new  lines,  and 
provide  on-going  maintenance  services.  Because  fiber  internet  provides  a 
faster,  more  reliable  connection  that  allows  the  almost-instantaneous 
transfer  of  large  amounts  of  data,  it  is  likely  that  these  effects  are  enhanced 
beyond  what  would  be  expected  with  more  common  speed  levels. 

Data 

Much of the prior literature in the United States used FCC Form
477  data  to  understand  where  broadband  technology  was  available.  While 
this  dataset  provided  a relatively  complete  picture,  it  did  not  offer  insight 
into  different  speeds  within  each  geographical  region.  As  part  of  the 
American  Recovery  and  Reinvestment  Act  of  2009,  the  National 
Broadband  Map  was  commissioned.  The  National  Broadband  Map 
provided  funding  for  each  state  to  gather  more  detailed  internet  data.  The 
methodology  used  by  each  state  to  obtain  these  data  differs  slightly,  but 
there  are  set  data  fields  that  each  state  is  required  to  provide.  This  semi- 
annual data  release  is  what  allows  the  examination  being  conducted  in  this 
study.  The  data  are  made  available  in  several  formats,  such  as  the  analyze 
tables  that  aggregate  internet  statistics  by  region  with  accompanying 
descriptive data that can be used in modeling efforts. The first analyze
table was released in 2011, and it has been released every six months
since.

Dependent  variables  for  this  model  will  come  from  the  Quarterly 
Census  of  Employment  and  Wages  (QCEW)  survey  conducted  by  the 
Bureau  of  Labor  Statistics  (BLS).  The  QCEW  provides  county-level 
summaries  of  a variety  of  economic  indicators,  including  employment, 
number  of  firms,  and  average  annual  pay,  broken  down  into  industry  and 
sector.  In  order  to  match  up  with  the  National  Broadband  Map  data, 
annual  average  survey  data  released  between  2011  and  2013  will  be  used. 

Combining  the  National  Broadband  Map  data  with  the  QCEW  data 
results  in  a dataset  that  contains  3,142  counties  with  6 observations  per 
county.  As  shown  in  Table  2,  between  the  first  and  last  time  period, 
roughly  two-thirds  of  the  counties  experienced  an  increase  in  access  to 
fiber  internet.  Counties  with  an  increase  in  access  to  fiber  experienced 
substantially  more  employment  growth  than  counties  that  did  not,  and  also 
had greater changes in the number of total firms and total average weekly
pay.  This  provides  some  initial  evidence  of  a positive  relationship  between 
access  to  fiber  internet  and  employment  growth;  however,  a simple 
difference  in  means  comparison  is  not  sufficient  to  draw  policy 
conclusions  from.  There  could  be  a variety  of  factors  that  contribute  to 
both  job  growth  and  improved  internet  infrastructure.  Further,  different 
counties  saw  drastically  different  changes  in  internet  access  and 
employment  growth. 

Table  3 breaks  up  the  counties  that  experienced  an  increase  in 
access  into  quartiles  (based  on  percent  of  households  with  access  to  fiber). 
The  relationship  between  the  magnitude  of  the  increase  in  access  and  the 
change  in  economic  indicators  is  more  complex  than  the  binary 
comparison,  although  there  are  some  indicators  where  there  is  clearly  a 
positive  correlation,  such  as  number  of  total  firms.  This  provides  evidence 
for  using  a continuous  rather  than  discrete  or  binary  variable  for  access  to 
fiber  internet. 
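
The grouping behind this kind of comparison can be sketched in a few lines of Python. The county records and field layout below are illustrative assumptions, not the study's data; the sketch only shows how counties might be binned by the size of their change in fiber access and how the average percent change in an indicator is then compared across bins.

```python
from statistics import mean

# Hypothetical county records: (change in share of households with fiber
# access, employment at start, employment at end). Values are made up.
counties = [
    (0.05, 1000, 1020),
    (0.10, 5000, 5150),
    (0.30, 2000, 2040),
    (0.60, 800, 840),
    (0.80, 12000, 12600),
]

# Bin edges mirror the four groups in Table 3 (<25%, 25-50%, 50-75%, >75%).
# The last upper bound is 1.01 so that a change of exactly 1.0 is included.
BINS = [(0.00, 0.25), (0.25, 0.50), (0.50, 0.75), (0.75, 1.01)]


def pct_change(start, end):
    """Percent change from start to end."""
    return (end - start) / start * 100.0


def mean_change_by_bin(records, bins=BINS):
    """Average percent change in employment for each access-change bin."""
    out = {}
    for lo, hi in bins:
        changes = [pct_change(s, e) for d, s, e in records if lo <= d < hi]
        out[(lo, hi)] = mean(changes) if changes else None
    return out


result = mean_change_by_bin(counties)
for (lo, hi), avg in result.items():
    print(f"{lo:.2f}-{hi:.2f}: {avg}")
```

Binning discards the variation inside each group, which is one reason a continuous measure of access may be preferable in the regression itself.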


Table 2. Comparing Economic Indicator Changes by Changes in Access to Fiber
between June 2011 and December 2013

                                                  Negative or No      Positive
                                                  Change in Access    Change in Access
                                                  to Fiber            to Fiber

Number of Counties                                1,008               2,134
Average % Change in Employment                    0.66%               2.49%
Average % Change in Firms                         1.31%               1.67%
Average % Change in Average Weekly Pay            10.13%              10.98%
Average % Change in Private Employment            1.89%               3.51%
Average % Change in Private Firms                 2.35%               2.31%
Average % Change in Private Average Weekly Pay    15.59%              15.51%

Source: National Broadband Map


Methodology 

This study uses a two-way fixed effects regression¹ to evaluate the
relationship  between  access  to  fiber  internet  connections  and  economic 
growth.  The  model  has  fixed  effects  for  county  and  for  time.  A fixed 
effects  regression  is  superior  to  a simple  cross-sectional  model  or  a pooled 
ordinary  least  squares  model  in  these  circumstances  because  it  allows  the 
model  to  control  for  unmeasured  characteristics  of  counties  that  may  be 
correlated with access to fiber technology and influence measures of
economic growth, in addition to factors that were common across all
counties  for  any  given  time  period.  If  the  hypothesis  holds,  counties  that 
experience  increases  in  access  to  fiber  internet  will  have  greater  increases 
in  employment  than  counties  that  have  no  change  in  high  speed  internet. 
While  the  fixed  effects  model  will  not  definitively  prove  causality,  it  does 
provide  a stronger  case  for  causality  than  a cross-sectional  model 
(Whitacre  et  al.,  2013). 
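
The mechanics of a two-way fixed effects estimate can be illustrated with a minimal Python sketch. It is not the study's code (the paper does not state which software was used); it assumes a balanced panel and a single regressor, and the county names, effects, and coefficient are made up. Subtracting county means and period means (and adding back the grand mean) removes anything that is constant within a county or common to all counties in a period, so only within-county, within-period variation identifies the slope.

```python
from statistics import mean


def two_way_demean(panel):
    """Subtract unit and period means and add back the grand mean.

    `panel` maps (unit, period) -> value and must be balanced
    (every unit observed in every period).
    """
    units = sorted({u for u, _ in panel})
    periods = sorted({t for _, t in panel})
    unit_mean = {u: mean(panel[(u, t)] for t in periods) for u in units}
    period_mean = {t: mean(panel[(u, t)] for u in units) for t in periods}
    grand = mean(panel.values())
    return {
        (u, t): panel[(u, t)] - unit_mean[u] - period_mean[t] + grand
        for u in units
        for t in periods
    }


def twfe_slope(y, x):
    """OLS slope of demeaned y on demeaned x (single regressor)."""
    y_dm, x_dm = two_way_demean(y), two_way_demean(x)
    num = sum(x_dm[k] * y_dm[k] for k in x_dm)
    den = sum(x_dm[k] ** 2 for k in x_dm)
    return num / den


# Toy balanced panel: 3 hypothetical counties observed in 3 periods.
county_effect = {"A": 9.0, "B": 7.5, "C": 11.0}
period_effect = {0: 0.00, 1: 0.02, 2: 0.05}
fiber = {
    ("A", 0): 0.10, ("A", 1): 0.20, ("A", 2): 0.50,
    ("B", 0): 0.00, ("B", 1): 0.00, ("B", 2): 0.10,
    ("C", 0): 0.30, ("C", 1): 0.60, ("C", 2): 0.65,
}
TRUE_BETA = 0.013  # illustrative value, not an estimate from the paper
log_emp = {
    (c, t): county_effect[c] + period_effect[t] + TRUE_BETA * fiber[(c, t)]
    for (c, t) in fiber
}

beta_hat = twfe_slope(log_emp, fiber)
print(round(beta_hat, 6))
```

Because the toy outcome contains no noise, the recovered slope matches the assumed coefficient even though the county and period effects are far larger than the fiber effect; a pooled regression on the same data would not have that property.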


Table 3. Comparing Economic Indicator Changes between 2011 and 2013 for Counties
that Increased Fiber Access

                                                  < 25%         25% - 50%     50% - 75%     > 75%
                                                  Change in     Change in     Change in     Change in
                                                  Access        Access        Access        Access

Number of Counties                                1,803         205           96            30
Average % Change in Employment                    2.41%         1.78%         4.75%         4.67%
Average % Change in Firms                         1.48%         2.08%         3.61%         3.55%
Average % Change in Average Annual Pay            11.02%        10.94%        10.20%        11.80%
Average % Change in Private Employment            3.52%         2.05%         5.58%         6.32%
Average % Change in Private Firms                 2.07%         3.09%         4.66%         3.87%
Average % Change in Private Average Annual Pay    15.55%        15.82%        13.62%        17.02%

Source: National Broadband Map


The  independent  variable  of  interest  will  be  percent  of  households 
within  a region  that  have  access  to  fiber  internet  technology.  Due  to  data 
limitations,  the  percent  of  households  having  access  serves  as  a proxy  for 
both  individuals  and  businesses  having  access  in  that  region.  Because 
GDP  is  not  available  at  the  county  level,  the  primary  dependent  variable 
will  be  employment,  which  is  available.  Additionally,  the  number  of  firms 
and  average  annual  wages  will  be  used  in  order  to  provide  a more 
comprehensive  overview  of  the  economic  impact.  For  both  employment 
and  firms,  natural  logs  will  be  used  so  that  the  results  are  meaningful 
across  counties  of  drastically  different  sizes.  By  running  each  model  for 
both  the  private  sector  and  total  economy,  fiber  internet’s  impact  on  the 
private sector and the public and non-profit sector can be contrasted.
Control  variables  for  county  demographics  and  access  to  cable  internet  are 
included  to  isolate  the  relationship  between  fiber  internet  and  employment. 
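
The reason for taking natural logs of employment and firm counts can be seen in a short Python sketch. The two counties and their sizes are hypothetical; the point is only that a change in logs reflects proportional growth, so the same growth rate looks the same in a small county and a large one.

```python
import math

# Two hypothetical counties of very different size, each growing by 2%.
small_before, small_after = 1_000, 1_020
large_before, large_after = 1_000_000, 1_020_000

# Differences in levels are not comparable across county sizes.
level_diff_small = small_after - small_before
level_diff_large = large_after - large_before

# Differences in natural logs are the same for both counties.
log_diff_small = math.log(small_after) - math.log(small_before)
log_diff_large = math.log(large_after) - math.log(large_before)

print(level_diff_small, level_diff_large)
print(round(log_diff_small, 4), round(log_diff_large, 4))
```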

Exhibit  1 shows  the  model  and  variables  that  are  the  main  focus  of 
this  paper. 


Exhibit  1.  Model  Variables  and  Predicted  Relationships 


$$Y_{it} = \beta_0 + \beta_1 X_{1it} + \beta_2 X_{2it} + \beta_3 X_{3it} + \beta_4 X_{4it} + \beta_5 X_{5it} + \alpha_i + \alpha_t$$


where: 


Variable       Variable Name            Definition                   Predicted     Study
                                                                     Relationship

Y              Ln(total employment)     The natural log of total                   Crandall, Lehr, Litan
                                        employment

Y (Alternate)  Ln(private employment)   The natural log of private                 Crandall, Lehr, Litan
                                        employment

Y (Alternate)  Ln(total firms)          The natural log of total                   Whitacre, Gallardo, Strover
                                        firms

Y (Alternate)  Ln(private firms)        The natural log of private                 Whitacre, Gallardo, Strover
                                        firms

Y (Alternate)  Ln(total annual          The natural log of total                   Whitacre, Gallardo, Strover
               average wage)            average weekly wages

Y (Alternate)  Ln(private annual        The natural log of private                 Whitacre, Gallardo, Strover
               average wage)            average weekly wages

X1             Households with access   The percent of households    Positive
               to optical fiber         within a county that have
                                        access to optical fiber
                                        internet

X2             Ln(population)           The natural log of the       Positive      Whitacre, Gallardo, Strover
                                        county population

X3             Adults with bachelors    The percent of the           Positive      Crandall, Lehr, Litan
               or greater               population with a
                                        bachelor's degree or
                                        greater

X4             Median Household Income  The median income for        Positive      Whitacre, Gallardo, Strover
                                        households within a county

X5             Households with access   The percent of households    Positive      Whitacre, Gallardo, Strover
               to cable                 within a county that have
                                        access to cable internet

α_i            County fixed effects     Controls for unmeasurable
                                        and constant differences
                                        between counties

α_t            Time fixed effects       Controls for unmeasurable
                                        and constant differences
                                        between time periods


The model will examine the relationship at the county level; Kolko
(2011) showed that a state-level model is too aggregated to show
statistically significant differences in access to broadband. While there is
substantial variation in the change in access to fiber internet at the state level,
the small sample size and the fact that most states are clustered at the lower
end of the spectrum would likely lead to a similar finding in this dataset.
Table 4 presents state-level fiber optic data. Figures 1 and 2 illustrate that
there is much variation at the county level in access to fiber
technology, providing a robust dataset on which to conduct analysis.
Further,  there  is  very  little  geographical  concentration  to  where  fiber  is 
being  deployed,  which  will  allow  the  results  of  this  model  to  be  applied 
across  all  of  the  United  States. 

The  National  Broadband  Map  began  data  collection  in  2010; 
however,  there  were  concerns  over  the  quality  of  the  first  year’s  data 
collection  methodology;  the  data  were  cleaned  up  for  subsequent  years 
(Whitacre  et  al.,  2013).  Therefore,  this  study  will  examine  data  from  each 
of  the  releases  in  2011,  2012  and  2013.  While  a larger  dataset  would  be 
ideal  in  order  to  understand  the  lasting  effect  of  increasing  access  to  fiber 
internet,  available  data  are  sufficient  to  provide  early  evidence  on  the 
relationship  between  access  to  fiber  internet  and  economic  indicators. 
Policymakers  will  not  delay  actions  for  the  next  few  years  in  order  to 
collect  more  data;  neither  should  researchers. 

Results 

The  results  for  the  primary  dependent  variable,  total  employment, 
are  displayed  in  Table  5.  Column  (1)  shows  a simple  one-way  fixed 
effects  model  with  no  control  variables;  the  coefficient  on  access  to  fiber 
internet  is  highly  statistically  significant,  with  a t-statistic  of  over  ten. 
When  the  natural  log  of  population  is  controlled  for  in  column  (2),  the 
coefficient and its significance do not change substantially; the R-squared
value, however, rises from 0.017 to 0.949. This is as expected, as the
overwhelming  determinant  of  how  many  employed  people  are  in  a county 
will  be  population.  In  column  (3),  controls  for  changes  in  demographic 
characteristics  are  added  in.  While  the  inherent  wealth  and  education  of 
each  county  are  absorbed  by  the  unit  fixed  effects,  adding  these  variables 
accounts  for  any  changes  in  income  and  education  level  that  may  have 
occurred  over  the  time  period  studied.  We  see  that  both  of  these  controls 
are  statistically  significant,  as  we  would  expect  since  wealth  and  education 
are  traditionally  positively  correlated  with  employment. 


Spring  2015 




Table 4. Percent of Households with Access to Fiber by State in June 2011 and
December 2013, Ranked by Access to Fiber in December 2013

                          Access to   Access to   Change in   Change in
                          Fiber       Fiber       Access to   Employment
State Name      Counties  June 2011   Dec. 2013   Fiber

Rhode Island    5         78.81%      97.05%      18.24%      2.26%
Oregon          36        37.32%      73.98%      36.66%      5.30%
South Dakota    66        64.78%      70.33%      5.55%       2.69%
Montana         56        1.57%       65.30%      63.73%      3.42%
New Jersey      21        54.71%      59.50%      4.79%       2.53%
North Dakota    53        15.53%      59.40%      43.87%      16.23%
New York        62        47.52%      57.76%      10.24%      4.34%
Maryland        24        52.51%      55.89%      3.38%       2.41%
Delaware        3         48.90%      50.00%      1.10%       3.66%
Pennsylvania    67        45.67%      48.22%      2.55%       1.27%
Utah            29        12.36%      46.10%      33.74%      8.77%
Indiana         92        3.46%       44.65%      41.19%      4.58%
Connecticut     8         6.01%       44.52%      38.51%      2.14%
Virginia        134       38.90%      42.88%      3.98%       1.90%
D.C.            1         20.13%      40.16%      20.03%      3.01%
Florida         67        22.54%      37.80%      15.26%      6.50%
Massachusetts   14        36.53%      37.06%      0.53%       4.23%
Iowa            99        27.96%      28.19%      0.23%       3.35%
Nebraska        93        2.10%       27.01%      24.91%      4.17%
Tennessee       95        12.75%      22.91%      10.16%      5.10%
Washington      39        22.08%      20.16%      -1.92%      5.36%
South Carolina  46        10.95%      19.38%      8.43%       4.17%
Kansas          105       6.67%       18.68%      12.01%      3.36%
Mississippi     82        15.55%      18.02%      2.47%       2.10%
Georgia         159       9.21%       16.41%      7.20%       4.43%
Vermont         14        14.95%      15.41%      0.46%       4.28%
Minnesota       87        5.53%       15.38%      9.85%       4.14%
California      58        13.23%      14.81%      1.58%       7.11%
Illinois        102       0.24%       14.59%      14.35%      2.92%
Nevada          17        1.82%       11.71%      9.89%       5.35%
Wyoming         23        6.24%       10.74%      4.50%       2.01%
Ohio            88        6.11%       10.52%      4.41%       3.84%
Texas           254       6.73%       9.87%       3.14%       7.37%
Missouri        115       5.68%       9.41%       3.73%       2.32%
Louisiana       64        8.02%       9.16%       1.14%       3.46%
Colorado        64        1.88%       9.08%       7.20%       7.12%
Kentucky        120       4.87%       8.99%       4.12%       3.77%
North Carolina  100       2.50%       8.84%       6.34%       4.81%
New Mexico      33        2.02%       8.19%       6.17%       1.58%
Idaho           44        3.42%       8.02%       4.60%       5.06%
Oklahoma        77        1.92%       7.35%       5.43%       4.46%
Hawaii          4         4.47%       6.41%       1.94%       6.31%
Alabama         67        6.75%       6.12%       -0.63%      2.07%
Arkansas        75        3.28%       5.63%       2.35%       0.93%
Wisconsin       72        1.95%       4.99%       3.04%       2.85%
New Hampshire   10        1.12%       2.60%       1.48%       3.06%
Michigan        83        1.29%       2.46%       1.17%       5.58%
Arizona         15        6.64%       2.38%       -4.26%      7.68%
Alaska          29        1.82%       1.99%       0.17%       -1.53%
West Virginia   55        0.30%       1.93%       1.63%       1.51%
Maine           16        0.27%       0.73%       0.46%       2.00%

Figure 1. Access to 1 gig/sec Download Speed by County (2011)
Source: National Broadband Map

Figure 2. Access to 1 gig/sec Download Speed by County (2013)
Source: National Broadband Map

Column (4) adds in a control for access to cable internet. This
ensures  that  any  association  between  access  to  fiber  internet  and 
employment  growth  is  not  actually  due  to  the  relationship  between 
employment  and  increased  access  to  internet  in  general.  The  coefficient  on 
cable  is  surprisingly  not  statistically  significant.  Based  on  the  body  of 
literature,  a positive  and  statistically  significant  coefficient  on  access  to 
cable  internet  was  anticipated.  A possible  explanation  for  this  could  be 
that  during  the  time  period  in  question,  roughly  $5  billion  in  stimulus 
funding  was  spent  on  expanding  broadband  access,  much  of  which  was 
spent  on  expanding  access  to  cable  internet  in  rural  areas  of  the  country. 
These  areas  that  did  not  already  have  access  to  broadband  likely  were 
some  of  the  hardest  hit  and  last  to  recover  from  the  recession,  explaining 
why  they  lag  behind  in  employment  growth  while  experiencing  an 
increase  in  access  to  cable  internet. 

Columns  (5)  and  (6)  add  in  time-fixed  effects.  This  is  particularly 
important  as  the  country  was  recovering  from  the  Great  Recession  during 
this  time,  so  employment  growth  could  be  the  result  of  a generally  positive 
economic  trend.  The  time  fixed  effects  may  account  for  part  of  the 
coefficient  for  access  to  fiber  internet,  yet  this  coefficient  is  still 
statistically  significant  at  a 99  percent  confidence  level.  Finally,  column 
(6)  adds  in  robust  standard  errors  to  control  for  potential 
heteroscedasticity. A control variable for state-level stimulus spending
delivered through the National Telecommunications and Information
Administration (NTIA) was also used, although not shown. Adding in the
control for NTIA stimulus spending had almost no impact on any of the
other  coefficients,  perhaps  because  the  only  available  data  are  not  at  the 
county  level  or  accurate  enough  in  terms  of  timing  of  implementation. 
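
The effect of switching to robust standard errors, as in column (6), can be illustrated with a small Python sketch. The paper does not say which robust estimator was used; the sketch assumes the basic White (HC0) form, a single regressor, and variables that have already been demeaned, and the data are made up. The coefficient is unchanged; only the standard error, and therefore the t-statistic, moves.

```python
import math


def slope_and_ses(x, y):
    """OLS slope through the origin with classical and HC0 standard errors.

    Intended for variables that are already demeaned, so no intercept is
    fitted. Returns (slope, classical_se, robust_se).
    """
    n = len(x)
    sxx = sum(xi * xi for xi in x)
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    resid = [yi - beta * xi for xi, yi in zip(x, y)]
    # Classical: one common error variance for every observation.
    sigma2 = sum(e * e for e in resid) / (n - 1)
    classical = math.sqrt(sigma2 / sxx)
    # HC0 (White): each observation keeps its own squared residual.
    robust = math.sqrt(sum((xi * e) ** 2 for xi, e in zip(x, resid))) / sxx
    return beta, classical, robust


# Made-up data in which the error spread grows with |x|.
x = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
y = [-3.9, -1.8, -1.1, 0.9, 2.4, 2.7]

beta, se_classical, se_robust = slope_and_ses(x, y)
print(beta, beta / se_classical, beta / se_robust)
```

In this toy example the robust standard error is larger than the classical one, so the t-statistic falls while the slope stays the same, which is the same pattern seen between columns (5) and (6) of Table 5.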

Similar  models  were  run  for  the  other  dependent  variables  of 
interest:  total  establishment  count,  total  average  weekly  wages,  private 
sector  employment,  private  establishment  count,  and  private  average 
weekly  wages;  the  results  for  model  (6)  are  shown  in  Table  6.  For  the 
wage  models,  median  household  income  is  replaced  by  the  log  of  total 
employment.  Statistically  significant  coefficients  are  found  for  total  and 
private  employment  and  total  and  private  establishment  count.  The 
coefficients  on  average  weekly  wages  were  significant  until  time-fixed 
effects  were  added  in,  which  soaked  up  most  of  the  coefficient  and 
significance.  Since  the  wages  are  in  nominal  values,  the  relationship 
depicted prior to adding time-fixed effects was likely due to inflation.²

Table 5. Regression Results for Total Employment

Variables                           Mean     (1)          (2)          (3)          (4)          (5)          (6)

Log Total Employment                9.134

% of Households w/ Access to Fiber  0.124    0.0369***    0.0358***    0.0263***    0.0263***    0.0132***    0.0134***
                                             (10.49)      (10.20)      (7.496)      (7.489)      (3.696)      (2.80)

Log (Population)                    10.27                 0.163***     0.115***     0.115***     0.164***     0.164***
                                                          (9.232)      (6.463)      (6.457)      (9.142)      (3.72)

% of Pop w/ Bachelors or Higher     0.168                              0.430***     0.430***     0.358**      0.356
                                                                       (2.596)      (2.595)      (2.181)      (1.01)

Median Household Income             45,883                             2.64e-06***  2.64e-06***  8.72e-07***  8.74e-07*
                                                                       (17.70)      (17.65)      (4.710)      (1.82)

% of Households w/ Access to Cable  0.566                                           0.000447     -0.00197     -0.00202
                                                                                    (0.119)      (-0.525)     (-0.36)

Constant                                     9.129***     7.450***     7.755***     7.755***     7.337***     7.345***
                                             (17,419)     (40.97)      (42.84)      (42.83)      (40.35)      (16.64)

Observations                                 18,848       18,848       18,848       18,848       18,848       18,848
R-squared                                    0.0170       0.9491       0.9258       0.9258       0.9550       0.9549
F stat                                       131.68       97.95        131.68       105.34       86.12        46.62
Number of counties                           3,142        3,142        3,142        3,142        3,142        3,142

t-statistics in parentheses
*** p<0.01, ** p<0.05, * p<0.1


Overall,  the  results  show  evidence  of  a strong  positive  correlation 
between  the  percent  of  households  that  have  access  to  optical  fiber  internet 
in  a county  and  the  number  of  employed  individuals  and  number  of  firms. 
Specifically, a 10 percentage point increase in the percent of households with
access to fiber internet is associated with a 0.13 percent increase in total
employment and a 0.1 percent increase in the number of firms. There is no
evidence  of  a relationship  between  access  to  fiber  internet  and  average 
weekly  wages  within  a county. 
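
The arithmetic behind these magnitudes can be checked with a short Python sketch. It assumes the fiber variable is measured as a share between 0 and 1 (consistent with the mean of 0.124 reported in Table 5), so that a 10-point rise in access is a change of 0.10; the function name is illustrative. With a logged outcome, the implied percent change is exp(coefficient × change) − 1.

```python
import math


def pct_effect(coef, change_in_share):
    """Percent change in the outcome implied by a log-level coefficient.

    `coef` is the coefficient on a regressor measured as a share (0 to 1);
    `change_in_share` is the change in that share (0.10 = 10 points).
    """
    return (math.exp(coef * change_in_share) - 1.0) * 100.0


# Column (6) coefficients on fiber access reported in Tables 5 and 6.
employment = pct_effect(0.0134, 0.10)
firms = pct_effect(0.0103, 0.10)
print(round(employment, 3), round(firms, 3))
```

For coefficients this small the exact formula and the simple product (coefficient × change × 100) agree to about three decimal places.
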

Table 6. Coefficient on Percent of Households with Access to Fiber Internet for each
Dependent Variable*

Dependent Variable                  Coefficient   T-stat      R-squared   F stat

Log Total Employment                0.01340       2.80***     0.9549      46.62
Log Total Establishment Count       0.01030       2.26**      0.9315      30.25
Log Total Average Weekly Wages      0.00061       0.17        0.0165      540.63
Log Private Employment              0.01580       2.65***     0.9389      66.82
Log Private Establishment Count     0.01068       2.09**      0.9188      41.43
Log Private Average Weekly Wages    0.00293       0.68        0.0535      847.74

*Each model controls for 2-way fixed effects (county and date) and demographics.


Without  a controlled  or  quasi-experiment,  a causal  relationship 
between  access  to  fiber  internet  and  employment  growth  cannot  be 
claimed,  but  the  results  shown  do  support  the  theory  that  installing  fiber 
internet  can  help  job  growth.  While  controlling  for  time  and  unit  fixed 
effects  and  other  controls  helps  to  isolate  the  relationship  between  access 
to  fiber  and  employment  growth,  there  is  still  the  possibility  that  there  are 
unmeasured  factors  that  influence  both  access  to  fiber  and  job  growth. 
While  state  level  NTIA  stimulus  spending  is  controlled  for,  county  level 
spending  cannot  be  controlled  for  due  to  data  limitations.  This  creates  a 
slight  problem;  while  the  source  of  funding  for  the  increase  in  access  to 
fiber  is  not  the  topic  of  this  paper,  stimulus  funds  had  a specific  goal  of 
creating  jobs  and  contractors  typically  had  to  lay  out  a plan  for  hiring 
additional  employees  as  a part  of  their  bid  for  stimulus  funding.  Therefore, 
if some of the infrastructure that led to the increase in fiber access was
funded by stimulus spending, it may have created more jobs than private
investment,  which  does  not  have  to  meet  any  job  creation  criteria.  While  a 
better  control  for  this  would  be  ideal,  it  is  unlikely  that  this  is  the  primary 
cause  of  the  positive  relationship.  As  mentioned  previously,  most  of  the 
broadband  stimulus  spending  went  to  expanding  access  to  cable 
technologies,  not  optical  fiber. 

Additionally,  there  is  a possibility  that  job  growth  is  driving  access 
to  fiber  internet,  rather  than  the  other  way  around.  The  positive 
relationship  could  be  due  to  internet  service  providers  expanding  into 
growing  areas.  While  it  is  likely  that  some  of  the  positive  relationship  can 
be  attributed  to  this,  it  is  unlikely  to  be  the  primary  reason.  Most  of 
America  still  is  without  access  to  fiber  internet,  so  service  providers  would 
be  more  likely  to  invest  in  areas  where  they  already  see  demand  rather 
than  trying  to  predict  where  employment  growth  will  be.  Additionally, 
laying the infrastructure for fiber internet takes time and planning; since
this model looks at six-month intervals, it is unlikely that service providers
saw  employment  growth  in  an  area  and  were  able  to  move  in  and  offer 
fiber  service  within  six  months. 

A final  critique  of  the  model  could  be  the  relatively  short-term 
time frame used. Policymakers are concerned with much longer time
frames than two years when investing heavily in internet technologies.
Unfortunately,  the  relative  newness  of  the  National  Broadband  Map  data 
set,  and  limitations  of  previous  data  collection  efforts,  limit  the  years  that 
can  be  examined.  As  data  collection  efforts  continue,  researchers  should 
continue  to  evaluate  this  relationship  to  test  whether  or  not  better  internet 
leads  to  sustained  growth,  or  if  growth  is  merely  temporary. 

Conclusion:  Policy  Relevance 

This study provides evidence that increasing access to state-of-the-art
internet, such as optical fiber, and employment growth are related.
Policymakers  considering  investments  in  improving  internet  technologies 
might  consider  these  results  when  debating  whether  or  not  the  cost  of  the 
investment  is  appropriate.  This  information  is  useful  to  policymakers  at  all 
levels  of  government,  who  have  taken  a variety  of  approaches  to 
improving  access  to  ultra-high  speed  internet  networks. 

In  January  of  2015,  the  FCC  changed  its  definition  of  broadband 
internet  from  offering  download  speeds  of  4 Mbit/sec  or  greater  to 
offering  much  faster  download  speeds  of  25  Mbit/sec  or  greater.  This  was 
a highly  contentious  shift  in  policy  that  will  impact  how  data  are  collected 
and  what  networks  qualify  for  future  public  investments.  Additionally,  it 
may  change  how  the  FCC  views  the  state  of  competition  within  the 
telecommunications  industry,  which  could  lead  to  other  legislative, 
executive  or  even  judicial  actions  (Brodkin,  2014).  While  this  study  does 
not  address  whether  or  not  25  Mbit/sec  internet  fosters  more  economic 
growth  than  4 Mbit/sec  internet,  it  does  provide  preliminary  evidence  that 
there  could  be  a public  interest  in  promoting  faster  internet  speeds.  This 
contradicts  what  many  of  the  detractors  of  the  FCC’s  change  in  definition 
have argued: that the internet is fast enough and people do not benefit any
more  from  speeds  over  25  Mbit/sec  than  they  would  at  lower  levels. 

Another  contentious  policy  area  has  been  the  recent  development 
of local (partially or fully) taxpayer-funded high-speed fiber networks
which offer internet speeds of up to 1 Gbit/sec (O’Toole, 2014). In
response to these networks, some states have considered blocking these
efforts in order to prevent municipalities from crowding out private
expansion into high-speed internet markets. This paper does not provide a
cost-benefit analysis of publicly owned fiber optic networks, but does
provide  evidence  that  policymakers  should  consider  when  deciding 
whether  or  not  these  municipal  fiber  networks  are  wise  uses  of  taxpayer 
funds. On the other hand, the non-significant coefficient on access
to cable internet may indicate that pushing internet technologies
into underserved regions does not necessarily lead to economic growth. A
more thorough examination of the specific investments made under the
stimulus act would provide better insight into this question, which was not
the primary focus of this paper.

As  part  of  the  American  Recovery  and  Reinvestment  Act,  the  FCC 
developed  the  National  Broadband  Plan  which  outlines  goals  for  internet 
infrastructure  in  America.  In  the  plan,  the  FCC  set  ambitious  long-term 
goals  including  providing  affordable  access  to  internet  with  speeds  of  100 
Mbit/sec  or  greater  to  at  least  100  million  homes  and  eventually  ensuring 
that  every  American  has  access  to  affordable  fiber  internet.  The  results  of 
this  paper  show  that  these  are  not  unfounded  goals,  and  there  may  be  a 
public,  economic  interest  in  achieving  the  goals  outlined  in  the  National 
Broadband  Plan. 


Endnotes 


1 A two-way fixed effects model controls for unmeasured variables that remain
constant  throughout  the  time  period  for  each  county,  as  well  as  variables  that  are  common 
across  all  units  for  a single  time  period.  These  variables  could  potentially  cause  bias  if 
left  uncontrolled  for. 

2 Diagnostic  tests  indicated  that  fixed  effects  are  preferable  to  random  effects  and 
suggested  that  robust  standard  errors  are  needed  due  to  potential  heteroscedasticity. 
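
To make endnote 1 concrete, the following is a minimal sketch of the two-way "within" transformation on a balanced panel. It is illustrative only and is not the study's estimation code: the toy numbers, the function names `two_way_demean` and `slope`, and the balanced-panel assumption are all introduced here for the example.

```python
def two_way_demean(panel):
    """Remove unit and period means from a balanced panel.

    panel[i][t] is the value for unit i (e.g., a county) in period t.
    Returns panel[i][t] - unit_mean[i] - period_mean[t] + grand_mean.
    """
    n_units, n_periods = len(panel), len(panel[0])
    unit_mean = [sum(row) / n_periods for row in panel]
    period_mean = [sum(panel[i][t] for i in range(n_units)) / n_units
                   for t in range(n_periods)]
    grand_mean = sum(unit_mean) / n_units
    return [[panel[i][t] - unit_mean[i] - period_mean[t] + grand_mean
             for t in range(n_periods)] for i in range(n_units)]


def slope(x, y):
    """OLS slope of y on x for already-demeaned panels (no intercept)."""
    num = sum(a * b for rx, ry in zip(x, y) for a, b in zip(rx, ry))
    den = sum(a * a for rx in x for a in rx)
    return num / den


# Hypothetical toy data: outcome = 2 * x + county effect + period effect.
x = [[1.0, 2.0, 4.0], [0.0, 3.0, 3.0], [2.0, 2.0, 5.0]]
county_effect = [10.0, -5.0, 3.0]   # constant over time within a county
period_effect = [0.0, 1.0, 7.0]     # common to all counties in a period
y = [[2.0 * x[i][t] + county_effect[i] + period_effect[t]
      for t in range(3)] for i in range(3)]

beta = slope(two_way_demean(x), two_way_demean(y))
print(round(beta, 6))
```

Because the county and period effects are additive, the transformation removes them exactly, and the slope on the transformed data recovers the coefficient of 2 used to build the toy outcome.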
Benjamin Banneker and Celestial Navigation:

Just  How  Did  They  Know  Where  They  Were, 

Then?1 
Abstract

Benjamin  Banneker  was  an  American  scientist  of  the  late  eighteenth 
century.  He  was  a self-educated  free  black  and  became  an  expert  in 
astronomy,  mathematics,  and  surveying.  Major  Andrew  Ellicott  asked 
him  to  join  the  team  surveying  the  original  boundaries  that  became 
Washington, D.C. This paper presents Banneker’s story, which is
inspiring for all those who struggle against strong odds, and also
discusses the techniques used in those days to determine latitude and
longitude for surveying.

Introduction 

Benjamin Banneker was born in 1731 in Baltimore County, Maryland.
He died in 1806 in Baltimore County, Maryland. He lived his entire life on
the family’s 100-acre tobacco farm near Oella, Maryland, a small hamlet
which is near Catonsville, Maryland. His mother, Mary, was a free black;
his father, Robert, was a freed slave from Guinea. Figure 1 shows a
reconstruction of his log cabin.


Figure  1.  A reconstruction  of  Banneker’s  cabin. 


1 Presented  at  the  Washington  Academy  of  Sciences  2015  Annual  Meeting  and  Awards 
Banquet,  May  14,  2015. 
One might think that free blacks were extremely rare. That is
almost  true.  The  state  of  Maryland  had  the  largest  number  of  free  blacks  of 
any  of  the  states  according  to  the  1830  census.  There  were  over  52,000 
free  blacks  in  Maryland  at  that  time. 

Figure  2 shows  a woodcut  of  Banneker.  It  probably  is  somewhat 
idealized.  It  appeared  on  the  cover  of  one  of  his  publications,  and  in  those 
days,  publishers  felt  free  to  embellish  their  publications. 


Figure  2.  Woodcut  of  Benjamin  Banneker.  Over  the  years  the  name  Banneker  has  had 
various  spellings,  and  currently  it  is  Banneker. 
Banneker may have attended a few years of a nearby Quaker
school;  however,  the  vast  majority  of  his  learning  was  self-taught.  He 
borrowed  books  from  neighbors  and  studied  them  thoroughly.  He  loved 
mathematics  and  astronomy  and  became  more  than  proficient  in  them;  he 
was  a skilled  researcher,  the  equal  of  any  of  his  contemporaries.  Later  in 
life  he  expressed  that, 

“The colour of the skin is in no way connected with
strength of the mind or intellectual powers.”

At  age  22,  Banneker  built  a wooden  clock  that  struck  the  hours 
throughout  his  life.  Clocks  were  not  common  items  in  the  late  1700s. 
Local  people  came  to  marvel  at  his  remarkable  clock.  He  was  not  the  first 
person  to  build  a clock  in  the  colonies,  but  he  was  one  of  the  rare  few  who 
succeeded  in  doing  so. 

Banneker  believed  in  seeking  peaceful  resolutions  to  conflicts. 
Along  with  Dr.  Benjamin  Rush  in  1792,  he  wrote  a proposal  to  the  Federal 
Government  asking  the  government  to  establish  a Peace  Office  with  equal 
status  to  the  Department  of  War.  Almost  200  years  later,  the  government 
set up the United States Institute of Peace. The U.S. Institute of Peace
(USIP)  works  to  prevent,  mitigate,  and  resolve  violent  conflict  around  the 
world.  USIP  does  this  by  engaging  directly  in  conflict  zones  and  by 
providing  analysis,  education,  and  resources  to  those  working  for  peace. 
Created by Congress in 1984 as an independent, nonpartisan, federally
funded organization, USIP has more than 300 staff working at the
Institute’s  Washington,  D.C.  headquarters  and  on  the  ground  in  the 
world’s  most  dangerous  regions. 

A Bit  of  Mathematics 

Benjamin  Banneker  loved  math.  He  taught  himself  algebra, 
geometry,  trigonometry,  and  spherical  trigonometry.  He  drew  great  delight 
from  creating  math  puzzles  and  solving  them.  One  of  his  puzzles  is: 

Divide  60  into  four  such  parts  that  the  first  being  increased 
by  4,  the  second  decreased  by  4,  the  third  multiplied  by  4, 
the  fourth  divided  by  4 such  that  the  sum,  the  difference, 
the  product,  and  the  quotient  shall  be  one  and  the  same 
number.2 


2 The  answer  to  the  puzzle  appears  at  the  end  of  this  paper. 


Spring  2015 




A Bit  of  Astronomy 

Banneker  liked  to  lie  outside  his  log  cabin  all  night  watching  the 
sky.  So  far  as  we  know  he  did  not  own  a telescope,  although  he  knew  how 
to  use  one  (his  neighbors,  the  Ellicotts,  had  telescopes).  Those  math  skills 
he  had  learned  he  applied  to  astronomy.  In  1788,  he  accurately  predicted 
the  solar  eclipse  of  1789.  Predicting  eclipses  had  long  been  the  province  of 
professional  astronomers.  The  techniques  for  doing  so  were  not  commonly 
taught.  He  timed  the  eclipses  of  the  Galilean  satellites  by  Jupiter.  Figure  3 
shows  those  satellites  with  Jupiter. 


[Figure 3 labels: Callisto, Ganymede, Europa, Io, Jupiter]


Figure  3.  The  Galilean  satellites  of  Jupiter. 


Surveying 

President  George  Washington  asked  Major  Andrew  Ellicott  to 
survey  the  land  we  now  call  Washington,  D.C.  It  became  the  District  of 
Columbia  in  1801.  A Commission  was  set  up  to  oversee  the  project.  In 
1791,  Ellicott  asked  Banneker  to  be  part  of  the  survey  team.  Banneker  was 
59  when  he  joined  Ellicott’s  team.  He  was  hired  for  his  astronomical  and 
mathematical  knowledge.  So  he  spent  several  months  slogging  through  the 
swampy  land  putting  down  milestone  markers. 
The original plan for Washington, D.C., called for a square 10 miles on a
side, equaling 100 square miles of land. For every mile of the
perimeter,  the  survey  team  laid  down  boundary  markers.  Figure  4 shows 
the  original  boundary  marker  at  6980  Maple  Street,  N.W.  Many  of  the 
boundary  markers  have  disappeared  over  the  years,  but  a few  remain. 
Some  of  those  have  Banneker’s  name  on  them. 


Figure  4.  Boundary  Marker  on  Maple  Street,  N.W. 


In  1846,  Washington,  D.C.,  gave  the  Virginia  portion  of  the 
District  of  Columbia  back  to  Virginia,  leaving  the  District  we  have  today. 

George Washington asked L’Enfant to design the new city.
L’Enfant drew up a plan for the city (see Figure 5), but ran afoul of the
Commission  overseeing  the  project.  He  left  the  project.  Major  Ellicott  then 
drew  up  a city  plan  (see  Figure  6)  that  was  used  to  construct  the  original 
city. 
Figure 5. L’Enfant’s plan for the city.

Note that Major Ellicott set 0° longitude at the Capitol Building.
The  0°  longitude  meridian  was  set  by  many  nations  at  many  places  over 
the  centuries.  In  most  cases,  however,  the  meridian  at  Greenwich, 
England,  was  used  as  0°  longitude  because  England  ruled  the  seas  in  the 
eighteenth  century. 

It  was  not  until  1884  by  international  treaty  that  Greenwich  was 
chosen  as  the  permanent  0°  longitude  — the  Prime  Meridian. 
Figure 6. Major Ellicott’s city plan.


The  Nitty-Gritty  of  1791  Surveying 

To  survey  is  to  measure  the  latitude  and  longitude  of  the  perimeter 
of  the  land  under  consideration.  Today  surveyors  use  GPS,  but  in  1791 
GPS  did  not  exist.  Surveyors  turned  to  events  in  the  sky  to  determine 
latitude  and  longitude.  This  means  astronomy. 
It takes two coordinates to find a location on a sphere [we do all
agree  that  we  live  on  the  surface  of  a sphere,  the  Earth]:  a left/right  one 
and  an  up/down  one.  Figure  7 illustrates  the  concept. 


Figure  7.  The  concept  of  latitude  and  longitude. 


The  up/down  is  latitude  which  runs  from  0°  at  the  equator  to  ±90° 
at  the  poles.  The  left/right  is  longitude  which  runs  from  0°  to  180°  East 
and 0° to 180° West. We call the latitudes circles of latitude; we call the
longitudes meridians of longitude. See Figure 8 for an illustration.

Figure 8. Illustration of the circles of latitude and meridians of longitude.

Let us start with latitude. People knew how to find their latitude by
the 2nd century BCE. Start with a sextant and sight along your horizon.
Then, in the northern hemisphere, find Polaris, the North Star (no telescope
needed). Measure the altitude of Polaris (how far you tilt upward from the
horizon) with the sextant. Your latitude is approximately equal to the
altitude of Polaris (equivalently, 90° minus the angle between Polaris and
the point directly overhead). It is a simple procedure. Figure 9 illustrates
the concept. Latitude uses angles ranging from 0° at the equator to ±90° at
the poles.

Figure 9. How to measure the altitude of Polaris.

Longitude, on the other hand, is not so simple. Someone must
define  a starting  point  for  0°  longitude.  The  British  colonies  (and  the  new 
United  States)  used  Greenwich  as  the  Prime  Meridian.  The  time  at 
Greenwich  is  known  as  Greenwich  Mean  Time  (GMT)  or  Universal  Time 
(UT). 


Longitude uses both time and angle. That raises the question of how
longitude relates time to angle. The Earth is not sitting still; it is spinning
on its axis. This does not matter for measuring latitude. It does matter for
longitude. The Earth rotates once around each day, or 360°/day. A day has
24 hours, so the Earth rotates 15° in 1 hour. (If you are at the equator,
you are spinning with the Earth at about 1675 km/hr.) If
you can measure your local time and the time at Greenwich, you can get
your longitude. For example, if it is 1800 hrs at Greenwich when it is 9h20m
local time on the same day, then your local time is 8h and 40m behind
Greenwich Time. Convert that value to angles, and you have your longitude:


\[
\text{Long} = -\left[ (8 \times 15^\circ) + \left( \frac{40}{60} \times 15^\circ \right) \right] = -[120^\circ + 10^\circ] = -130^\circ = 130^\circ\ \text{West}
\]
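
The conversion above can be sketched in a few lines of Python. This is a minimal illustration rather than a surveying tool: the function name `longitude_from_times` and its arguments are hypothetical, and it assumes both clock readings fall on the same day and are given as hours and minutes.

```python
DEGREES_PER_HOUR = 360 / 24  # the Earth turns 15 degrees each hour


def longitude_from_times(greenwich_hm, local_hm):
    """Return longitude in degrees (negative = West) from two clock readings.

    Each argument is an (hours, minutes) pair read on the same day.
    """
    greenwich_hours = greenwich_hm[0] + greenwich_hm[1] / 60
    local_hours = local_hm[0] + local_hm[1] / 60
    # Local time behind Greenwich means the observer is west of Greenwich.
    return (local_hours - greenwich_hours) * DEGREES_PER_HOUR


# Worked example from the text: 18h00m at Greenwich, 9h20m locally.
longitude = longitude_from_times((18, 0), (9, 20))
direction = "West" if longitude < 0 else "East"
print(f"{abs(longitude):.0f} degrees {direction}")
```

For the example in the text, this reproduces the 130° West result. A local time ahead of Greenwich would give a positive (East) longitude.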


However, in the 1700s, one could not call Greenwich from
Washington, D.C., and ask the time. So the survey team needed to use
events in the sky. They did have an ephemeris. An ephemeris is a
time-ordered list of future positions of the Sun, Moon, planets, satellites, and
stars (at 0 hrs GMT). Astronomers were paid to develop ephemerides for
various  places.  This  took  a great  deal  of  mathematical  skill.  Most  travelers 
took  an  ephemeris  with  them  on  their  travels. 

An  Almanac  contains  an  ephemeris  along  with  other  important 
information  such  as  holidays,  eclipses,  sunrise,  and  sunset.  The  Federal 
Government  still  publishes  the  Astronomical  Almanac  and  the  Nautical 
Almanac  each  year.  An  example  of  an  ephemeris  for  Mars  for  the  month 
of  June  2015  looks  like  this: 


Target body name: Mars
Start time: 2015-May-31 00:00:00.0000 UT
Stop time: 2015-Jun-30 00:00:00.0000 UT
Step-size: 1440 minutes

Date (UT)    HR:MN    R.A.    DEC
2015-May-31  00:00
2015-Jun-01  00:00
2015-Jun-02  00:00
2015-Jun-03  00:00
2015-Jun-04  00:00
2015-Jun-05  00:00
2015-Jun-06  00:00
2015-Jun-07  00:00
2015-Jun-08  00:00
2015-Jun-09  00:00
2015-Jun-10  00:00
2015-Jun-11  00:00
2015-Jun-12  00:00
2015-Jun-13  00:00
2015-Jun-14  00:00
2015-Jun-15  00:00
2015-Jun-16  00:00
2015-Jun-17  00:00
2015-Jun-18  00:00
2015-Jun-19  00:00
2015-Jun-20  00:00
2015-Jun-21  00:00
2015-Jun-22  00:00
2015-Jun-23  00:00
2015-Jun-24  00:00
2015-Jun-25  00:00
2015-Jun-26  00:00
2015-Jun-27  00:00
2015-Jun-28  00:00
2015-Jun-29  00:00
2015-Jun-30  00:00

The coordinates are right ascension and declination, two common
astronomical coordinates.

There  are  two  important  clocks  one  needs  for  surveying  in  the 
1700s:  a clock  keeping  GMT  and  a clock  keeping  local  time.  If  the  GMT 
clock  stops,  it  is  very  difficult  to  retrieve  the  correct  GMT  (one  cannot  call 
home). If the local clock stops, it is not as difficult to retrieve the local
time  but  it  does  take  some  effort. 

Banneker  had  the  responsibility  for  keeping  the  clocks  wound. 
This  was  a vital  position  to  have. 

Now they needed a celestial event to time. There were a few
schemes used in the 1700s; most were not very precise. One celestial
event that showed promise was the planet Jupiter eclipsing the four
Galilean satellites (the four brightest moons of Jupiter, discovered by
Galileo; see Figure 3). They had an ephemeris for the eclipse times for
these satellites. Today you can use a smartphone app to get the ephemeris
for the Galilean satellites.

Both  the  Mason-Dixon  Line  and  the  boundary  for  Washington, 
D.C.,  were  set  using  the  eclipses  of  the  Galilean  satellites. 

Banneker  worked  on  the  survey  team  for  just  a few  months.  Ill 
health  drove  him  back  to  his  farm.  He  continued  to  compute  ephemerides 
and,  beginning  in  1792,  published  a series  of  six  Almanacs.  They  sold  in 
six  cities  in  four  states  for  the  years  1792  through  1797:  Baltimore; 
Philadelphia,  Pennsylvania;  Wilmington,  Delaware;  Alexandria,  Virginia; 
Petersburg,  Virginia;  and  Richmond,  Virginia.  They  were  best  sellers  at 
the  time.  Today,  very  few  exist.  The  Maryland  Historical  Society  has  one. 
The  cover  for  the  1792  edition  is  shown  in  Figure  10. 

People  who  did  not  have  clocks  depended  on  an  Almanac  to  give 
them  sunrise  and  sunset  times  so  they  could  tell  the  time  of  day. 

Summary 

Benjamin  Banneker  was  an  American  scientist  of  repute.  As  a 
testament  to  his  reputation,  the  Federal  Gazette  wrote  the  following 
obituary:  “Mr.  Banneker  is  a prominent  instance  to  prove  that  a 
descendant  of  Africa  is  susceptible  of  as  great  mental  improvement  and 
deep  knowledge  into  the  mysteries  of  nature  as  that  of  any  other  nation.” 
There  are  some  who  say  that  his  intellect  matched  that  of  Ben  Franklin. 

There  are  many  schools  named  after  Banneker,  and  the  Benjamin 
Banneker  Museum  and  Park  is  maintained  by  Baltimore  Recreation  and 
Parks.  Its  address  is  300  Oella  Avenue,  Catonsville,  MD  21228. 

Benjamin Banneker’s

PENNSYLVANIA, DELAWARE,
MARYLAND and VIRGINIA

EPHEMERIS,

For the YEAR of our LORD,

1792;

Being BISSEXTILE, or LEAP-YEAR, and the Sixteenth
Year of AMERICAN INDEPENDENCE,
which commenced July 4, 1776.

Containing, the Motions of the Sun and Moon, the true
Places and Aspects of the Planets, the Rising and Setting of
the Sun, and the Rising, Setting and Southing, Place and Age
of the Moon, &c.—The Lunations, Conjunctions, Eclipses,
Judgment of the Weather, Festivals, and other remarkable
Days; Days for holding the Supreme and Circuit Courts of the
United States, as also the usual Courts in Pennsylvania,
Delaware, Maryland, and Virginia.—Also, several useful Tables,
and valuable Receipts.—Various Selections from the
Commonplace-Book of the Kentucky Philosopher, an American Sage;
with interesting and entertaining Essays, in Prose and Verse—
the whole comprising a greater, more pleasing, and useful
Variety, than any Work of the Kind and Price in North-America.

BALTIMORE: Printed and Sold, Wholesale and Retail, by
William Goddard and James Angell, at their
Printing-Office, in Market-Street.—Sold, also, by Mr. Joseph
Crukshank, Printer, in Market-Street, and Mr. Daniel
Humphreys, Printer, in South-Front-Street, Philadelphia
and by Messrs. Hanson and Bond, Printers, in Alexandria.


Figure  10.  1792  cover  for  the  Banneker  Almanac. 

Bio

Sethanne  Howard  is  an  astronomer  and  retired  Chief  of  the 
Nautical  Almanac  Office  at  the  U.S.  Naval  Observatory.  She  maintains 
her  research  field  of  interacting  galaxies.  As  the  first  woman  to  receive  a 
bachelor’s  degree  in  physics  from  the  University  of  California,  Davis,  she 
went  on  to  get  a master’s  degree  in  nuclear  physics  from  Rensselaer 
Polytechnic  Institute,  and  a PhD  in  astrophysics  from  Georgia  State 
University.  She  worked  at  several  astronomical  observatories,  at  NASA 
managing  operational  astrophysical  satellites,  at  NSF  as  Program  Officer 
for  Extragalactic  Astronomy  and  Cosmology,  and  finally  at  the  U.S.  Naval 
Observatory. 


The  answer  to  the  math  puzzle  presented 
earlier  in  this  paper  is: 

W = 5.6  is  the  first  part 

X = 13.6  is  the  second  part 

Y = 2.4  is  the  third  part 

Z = 38.4  is  the  fourth  part 

W + X + Y + Z = 60 

W + 4 = X - 4 = Y × 4 = Z / 4 = 9.6

One  needs  to  solve  the  set  of  simultaneous 
equations  to  get  the  solution. 
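
One way to set up those simultaneous equations is to call the common value v, so that the four parts are v - 4, v + 4, v / 4, and 4v, and their sum must equal 60. The short Python sketch below follows that route using exact fractions. The function name `solve_banneker_puzzle` and its parameters are made up for illustration, and this is only one of several ways to reach the answer.

```python
from fractions import Fraction


def solve_banneker_puzzle(total=60, k=4):
    """Split `total` into W, X, Y, Z with W + k = X - k = Y * k = Z / k."""
    # With common value v: W = v - k, X = v + k, Y = v / k, Z = v * k,
    # so W + X + Y + Z = v * (2 + 1/k + k) = total.
    v = Fraction(total) / (2 + Fraction(1, k) + k)
    return v - k, v + k, v / k, v * k


parts = solve_banneker_puzzle()
print([float(p) for p in parts])  # [5.6, 13.6, 2.4, 38.4]
```

Here v works out to 48/5 = 9.6, which matches the common value given above.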

Washington Academy of Sciences
Awards  Program  2015 


Background 


The  purpose  of  the  Washington  Academy  of  Sciences,  which  was 
founded  more  than  a century  ago  in  1898,  is  to  encourage  the 
advancement  of  science  and  “to  conduct,  endow,  or  assist  investigation  in 
any  department  of  science.”  To  recognize  scientific  work  of  distinction, 
the  Academy  gives  awards  annually  to  scientists  who  work  in  the  greater 
Washington,  D.C.,  area.  The  awards  are  presented  by  colleagues  at  the 
Academy’s annual business meeting and awards ceremony.1 The public is
invited to help celebrate and recognize the extraordinary achievements of
the honored scientists and engineers, so the Academy hosts a formal
Business and Awards Banquet in the Washington area. At this ceremony,
the nominating colleague gives a short, 3-minute introduction describing
the  awardee,  and  the  awardee  must  be  present  to  accept  the  award,  but  the 
tradition  of  requiring  formal  acceptance  speeches  ended  back  in  1955. 


Photo: Al Teich


Washington  Academy  of  Sciences  annual  Awards  Banquet  at  the  conference  center  of  the 
National Rural Electric Cooperative Association (NRECA) in Arlington, Virginia, May 14, 2015.


1 Per  the  Academy’s  by-laws,  the  annual  business  meeting  takes  place  by  the  third 
Thursday  in  May,  and  usually  consists  of  brief  reports  by  the  outgoing  and  incoming 
presidents  and  an  audit  report. 

Awards Program Early History

While the Academy passed its centennial year in 1998, the
Academy’s Awards Program has now featured 75 years of achievement.
It is interesting to recount the history of the program, which
began in 1940. The Academy’s Bylaws had been amended the previous
year  to  permit  the  Academy  to  award  “medals  and  prizes  . . . [for]  scientific 
work  of  high  merit.” 

At  that  time,  1939,  the  Academy’s  Board  of  Managers  established 
awards  for  noteworthy  accomplishments  during  the  year  by  young 
scientists  — no  more  than  40  years  old  — in  the  biological,  physical,  and 
engineering  sciences.  A proposal  to  raise  the  age  limit  to  45  for  the 
Biological,  Engineering,  and  Physical  Sciences  categories  was  rejected  in 
1953.  The  requirement  that  “candidates  shall  not  have  passed  their  41st 
birthday”  was  dropped  later  on  in  1982,  and  1983  was  the  first  year  in 
which  an  award  for  a Distinguished  Career  in  Science  was  given. 

Some  Award  Traditions 

The  year  1956  marked  the  first  year  that  more  than  one  award  was 
presented  in  a given  category.  In  1961,  the  Board  of  Managers  officially 
encouraged  granting  more  than  one  award  in  any  given  category  should 
multiple  qualified  candidates  exist. 

Traditionally,  the  Academy’s  awards  have  been  given  for  work 
done  in  the  Washington,  D.C.,  area.  Since  the  Washington  Academy  of 
Sciences  was  incorporated  in  1898,  the  year  1998  marked  the  Academy’s 
centennial year, and the D.C.-area tradition was waived during that year,
as some awards were given to individuals affiliated with organizations
outside the D.C. area.2 In 1998, the Academy gave fourteen Centennial
Awards  for  Lifetime  Achievement  in  Science,  including  awards  in  Science 
Policy,  Technology  Policy,  and  History  of  Science.  The  next  year,  1999, 
was  the  first  non-centennial  year  in  which  the  award  for  Science  Policy 
was  given.  The  History  of  Science  award  was  not  given  again  until  2012, 
which  was  the  same  year  Service  to  Science  was  first  awarded.  An  award 
for  Lifetime  Achievement  in  the  Public  Understanding  of  Science  was  first 
made  in  2014. 


2 The  Washington  Academy  of  Sciences  founders  included  Alexander  Graham  Bell  and 
Samuel  Langley,  Secretary  of  the  Smithsonian  Institution  from  1887  to  1906. 

Establishment of the Education and Teaching Awards

The  first  year  an  award  was  given  for  the  Teaching  of  Science  was 
1952.  For  this  award  category,  and  for  this  category  only,  the  age 
limitation  of  40  years  was  waived.  This  “special  award”  was  given  in  1952 
and  1953.  The  award  category  for  the  Teaching  of  Science  was  officially 
established  in  1956. 

In  1976,  the  Berenice  Lamberton  Award  for  the  Teaching  of 
Science  in  High  Schools  was  established.  Lamberton  was  a professor  at 
Georgetown  University  with  a long-time  interest  in  education.  The 
Washington  Academy  of  Sciences’  Junior  Academy  was  initially  set  up  by 
Lamberton  and  others  at  Georgetown. 

The  Leo  Schubert  Award  for  College  Teaching  was  established  in 
1979.  The  year  2000  was  the  first  year  in  which  awards  were  given  for 
Achievement  in  Education  and  Teaching  of  Science  in  Middle  Schools. 

In  2002,  the  Board  of  Managers,  acting  on  the  recommendation  of 
the Awards Committee, established the Marilyn Krupsaw Award for
Non-Traditional  Education/Teaching.  The  award  was  named  in  honor  of 
the  long-time  leader  of  George  Washington  University’s  Science  and 
Engineering  Apprentice  Program  (SEAP)  for  high  school  students, 
sponsored  by  the  U.S.  Department  of  Defense  (DoD).  The  award  was 
presented  for  the  first  time  in  2004.  And  in  2005,  a special  award  was 
given  for  Service  to  Science  Education. 

Nomination  Process 

The  Academy  welcomes  nominations  for  its  Awards  Program  each 
year.  The  following  is  a complete  list  of  the  award  categories  as 
established  by  the  Academy’s  Board  of  Managers: 

• Distinguished  Career  in  Science 

• Biological  Sciences 

• Engineering  Sciences 

• Physical  Sciences 

• Health  Sciences 

• Behavioral  and  Social  Sciences 

• Mathematics  and  Computer  Science 

• Krupsaw  Award  for  Non-Traditional  Teaching 

• Lamberton  Award  for  Teaching  of  Science  in  High  School 

• Leo  Schubert  Award  for  Teaching  of  Science  in  College 

• Special Award (e.g., science policy, lifetime achievement in
education)

To  carry  out  the  Awards  Program  each  year,  the  Academy’s  Vice 
President  for  Membership  appoints  an  Awards  Committee  which  sets  the 
submission  dates  for  the  year.  Please  watch  the  Academy’s  website, 
www.washacadsci.org,  for  those  deadlines,  typically  in  early  Spring.  To 
nominate  an  individual,  print  and  complete  the  Nomination  Form  that  is 
available  at  the  website,  and  mail  it  directly  to  the  Awards  Committee  as 
indicated  on  the  form. 

The  Awards  Committee  typically  uses  the  standard  categories  that 
appear  on  the  nomination  form,  but  when  necessary,  the  Special  Award 
category  may  be  used  to  include  other  categories.  For  example,  an  award 
for  Mathematics  was  first  given  in  1960.  This  other  award  category  was 
expanded  to  Mathematics  and  Computer  Sciences  in  1979.  The  Academy 
first  made  an  award  for  Behavioral  Sciences  in  1976;  this  other  award 
category was changed to Behavioral and Social Sciences in 1987.4 Awards
were  first  given  for  two  other  categories  — Health  Sciences  and 
Environmental  Science  — in  1997,  and  later  for  Public  Health.  The  year 
2000  was  the  first  year  awards  were  given  for  the  categories  of 
Anthropology  and  Astronomy.  Back  in  1961,  the  Board  rejected  a 
proposal  that  an  Earth  Sciences  award  category  be  instituted;  it  was  not 
until  2014  that  an  award  for  Lifetime  Achievement  in  Natural  Resources 
Sciences  was  made. 


2015  Annual  Banquet 

At  the  Washington  Academy  of  Sciences  Annual  Business  and 
Awards  Banquet  on  May  14,  2015,  the  Academy’s  ceremony  honored  an 
illustrious  group  of  individuals  for  their  work  in  physical,  biological,  and 
engineering  sciences  and  other  areas. 

Ronald  Colie  received  the  2015  award  for  Distinguished  Career  in 
Science  in  recognition  of  his  “lifetime  work  and  major  contributions  in 
radionuclidic  metrology.  Within  the  world  of  radioactivity  measurements, 
it  is  almost  impossible  to  hear  the  words  ‘radon,’  ‘uncertainty’  or 


3 Since  1 990,  additional  awards  have  also  been  given  at  the  annual  awards  ceremony  for 
special  recognition  of  Service,  or  Meritorious  Service,  to  the  Washington  Academy  of 
Sciences;  however,  these  awards  go  through  a different  process. 

4 Amitai Etzioni, noted earlier in this issue of the Journal of the Washington Academy of
Sciences,  was  a recipient  of  this  award  in  1988. 
‘metrologist’ without thinking of the name Dr. Collé.” He is a specialist in
nuclear  radiochemistry  and  the  development  of  standards,  and  he  and  his 
collaborators  developed  methods  to  analyze  and  standardize 
brachytherapy  sources,  pellets  of  radioactive  material  designed  to  be 
implanted  in  the  body  at  sites  requiring  direct  radiation  exposure. 

The  Academy  presented  its  2015  award  for  Distinguished  Career 
in  Engineering  Sciences  to  Dr.  Ram  Duvvuru  Sriram  in  recognition  of 
“contribution  and  technical  leadership  in  developing  computational  tools 
and  techniques  for  engineering  design  and  for  enabling  interoperability  of 
CAD/CAM/CAE  systems.” 

A 2015  award  for  Physical  and  Biological  Sciences  was  presented 
to  Dr.  Marcus  Cicerone  in  recognition  of  “establishing  and  pioneering 
the  use  of  Broad  Band  Coherent  Anti-Stokes  Raman  Spectroscopy 
imaging  and  establishing  exquisite  optical  techniques  for  examining  the 
dynamics  of  proteins  and  other  biological  molecules  in  the  glassy  sugar 
matrices  commonly  used  for  their  preservation.” 

The  Academy’s  2015  award  for  Biological  Sciences  was  presented 
to  Dr.  Paul  M.  Peterson  in  recognition  of  being  a “tireless  and  prolific 
taxonomist,  collector,  and  publisher  who  has  extensively  revised  the 
classification  of  the  large  grass  subfamily  Chloridoideae  and  its  genera, 
and  is  leading  the  effort  to  prepare  a DNA  database  for  the  grasses  of 
North  America  and  noxious  weeds  for  the  Bar  Code  of  Life.” 

The  Academy  presented  its  2015  award  for  Engineering  Sciences 
to  Dr.  Robert  Gover  in  recognition  of  “work  at  the  Naval  Research 
Laboratory  on  the  development,  implementation,  and  application  of  high- 
fidelity  physics-based  digital  models  for  the  development  of  optimized 
Electronic Warfare countermeasures against modern anti-shipping cruise
missiles.” 

A 2015  award  for  Physical  Sciences  went  to  Mr.  Gregory  Strouse 
in  recognition  of  “international  leadership  in  high-precision  temperature 
metrology,  and  innovative  contributions  to  next-generation  temperature 
sensors.” 

The  Krupsaw  Award  for  Non-Traditional  Teaching  was  presented 
in  2015  to  Ms.  MaryBeth  Petrasek  in  recognition  of  her  “teachings  in  the 
techniques  of  medicolegal  death  investigation  and  forensic  pathology  to 
young  people.” 

A 2015  award  was  also  presented  to  Dr.  Sally  Rood  in  special 
recognition  of  Service  to  the  Academy  for  “momentous  work  as  editor  of 
the Journal of the Washington Academy of Sciences and coordination of
the  agreement  to  begin  the  process  of  digitally  preserving  more  than  100 
years  of  the  Journal’s  published  works.” 


Photo: Al Teich

Lisa Karam presenting the award for Distinguished Career in Science to Ronald Collé.

Photo: Al Teich

Award for Distinguished Career in Engineering Sciences presented to Ram D. Sriram (right) by Steven Fenves.

Photo: Al Teich

Award for Physical and Biological Sciences, presented to Marcus Cicerone by Laurie Locascio.

Photo: Al Teich

Award for Biological Sciences, presented to Paul Peterson (left) by Chris Puttock.

Photo: Al Teich

Award for Engineering Sciences, presented to Robert Gover (left) by Douglas Fraedrich.

Award for Physical Sciences, presented to Gregory Strouse (right) by Gerald Fraser.

Photo: Al Teich

Krupsaw Award for Non-Traditional Teaching, presented to MaryBeth Petrasek by Anne Cupero (left).

Photo: Al Teich

Special recognition for Service to the Washington Academy of Sciences was presented to
Sally Rood (right) by Master of Ceremonies Terrell Erickson.


Spring 2015

Washington Academy of Sciences


1200 New York Ave.
Suite 113
Washington DC
20005

www.washacadsci.org


Addendum*  to 

Washington  Academy  of  Sciences 
2014  Membership  Directory 


M=Member;  F=Fellow;  LF=Life  Fellow;  LM=Life  Member; 
EM=Emeritus  Member;  EF=Emeritus  Fellow 


Adkins,  Michael  K.  (Mr.)  4143  Elizabeth  Lane,  Annandale  VA  22003 
(M) 

Arif,  Muhammad  (Dr.)  National  Institute  of  Standards  and  Technology 
(NIST),  100  Bureau  Drive,  MS  8460,  Gaithersburg  MD  20899-8460  (M) 

Berry,  Jesse  F.  (Mr.)  2601  Oakenshield  Drive,  Rockville  MD  20854 
(M) 

Boisvert,  Ronald  F.  (Dr.)  National  Institute  of  Standards  and  Technology 
(NIST),  100  Bureau  Drive,  MS  8910,  Gaithersburg,  MD  20899-8910  (F) 

Brown, Elise A. B. (Dr.) 6811 Nesbitt Place, McLean VA 22101-2133
(LF) 


* These  twenty  names  were  inadvertently  omitted  from  the  Academy’s  2014 
Membership  Directory  in  the  Winter  2014  issue  of  the  Journal  of  the  Washington 
Academy  of  Sciences,  so  we  are  printing  them  here  instead  of  waiting  to  include  them  in 
the  2015  membership  listing.  As  indicated  in  the  inside  cover  of  each  quarterly  Journal, 
the  last  issue  of  the  year  contains  a directory  of  the  current  membership  of  the  Academy. 
Buford, Marilyn (Dr.) 3073 White Birch Court, Fairfax VA 22031 (F)

Caws, Peter J. (Dr.) 2475 Virginia Avenue, NW, Apt. 230, Washington
DC 20037 (M)

Cupero, Jerri Anne (Dr.) 2860 Graham Road, Falls Church VA 22042
(F)

Danner, David L. (Dr.) 1364 Beverly Road, Suite 101, McLean VA
22101 (M)

Elster, Eric Andrew (Dr.) 3223 Geiger Avenue, Kensington MD 20895
(F)

Hollinshead, Ariel (Mrs.) 23465 Harbor View Road, #622, Punta Gorda
FL 33980-2162 (F)

Jayarao, Arundhati (Dr.) 8811 Trafalgar Court, Springfield VA 22151
(M)

Kaufhold,  John  (Dr.)  4601  N.  Fairfax  Drive,  Suite  1200,  Arlington  VA 
22203  (M) 

Martin,  Charles  R.  (Dr.)  P.O.  Box  98521,  M/S  NLV085,  Las  Vegas  NV 
89193  (F) 

Mittleman,  Don  (Dr.)  4650  54th  Avenue  S.,  Apt.  57B,  St.  Petersburg  FL 
33711-4638  (F) 

O’Hare,  John  J.  (Dr.)  108  Rutland  Boulevard,  West  Palm  Beach  FL 
33405-5057  (EF) 

Sozer,  Amanda  (Dr.)  4707B  Eisenhower  Avenue,  Alexandria  VA 
22304  (M) 

Snieckus,  Mary  (Ms.)  1700  Dublin  Drive,  Silver  Spring  MD  20902  (M) 
Williams, Jack (Dr.) 6022 Hardwick Place, Falls Church VA 22041 (F)


Williams,  Tenisha  (Ms.)  1209  7th  Street,  NW,  Washington  DC  20001 
(M) 


Wu,  Keli  (Mr.)  360  Swift  Avenue,  Suite  48,  South  San  Francisco  CA 
94080  (M) 
In Memoriam

Burton  G.  Hurdle 
(1918-2015) 

Burton Garrison Hurdle, a Fellow of the Washington Academy
of Sciences, passed away peacefully on March 4, 2015. He was a research
physicist with the Naval Research Laboratory’s Acoustics Division
for 50 years, beginning in 1943.

Hurdle was born in 1918 in Roanoke, Virginia, the son of Grover
Cleveland  Hurdle  and  Bronna  Rene  (Garrison)  Hurdle.  He  was  raised 
during  the  Great  Depression  and  graduated  from  Jefferson  Senior  High 
School  in  Roanoke  in  June  1936.  After  graduation  from  high  school,  he 
went  to  work  for  the  Norfolk  and  Western  Railway  that  had  its 
headquarters  in  Roanoke.  In  that  period,  he  started  taking  night  classes  and 
then  switched  to  becoming  a full  time  undergraduate  student  at  Roanoke 
College. 

In 1941, he received his B.S. degree in physics, with a minor in
mathematics, and then enrolled at Virginia Polytechnic Institute for
graduate studies. He intended to major in mechanical engineering at
Virginia Tech, but after only two weeks decided to major once again in
physics. While he was studying for his Master’s Degree in physics he
taught some classes in the university’s Mathematics Department to
supplement his income. He also had an industrial fellowship with the
Standard Register Company of Dayton, Ohio. Although he was within about
a year and a half of receiving a Doctorate in Physics, he left the
university to join the U.S. Navy for the war effort, and so was awarded
an M.S. degree in General Physics at that time. While at Virginia Tech,
he had interviewed with recruiters from the Naval Research Laboratory
(NRL). After considering several other potential job opportunities, he
accepted a position in the NRL Sound Division as a Research Physicist
and started work there in 1943. NRL was doing much applied research then
in support of the war effort. His doctoral thesis was on acoustic
interference fields in the ocean. Hurdle worked under all five
Superintendents of the Acoustics Division. His first supervisor at NRL
was Dr. Raymond Steinberger, and his early senior NRL colleagues included
Harvey Hayes, Raymond Steinberger, and Prescott Arnold, who were all
Harvard-educated scientists.

Hurdle  briefly  left  NRL  during  the  period  1947-1949  to  work  at 
Engineering  Research  Associates’  Physics  and  Chemistry  Division  in 
Arlington,  Virginia.  During  this  period,  he  worked  on  several  research 
projects  including  investigations  of  the  sound  speed  and  absorption  in 
liquids  using  an  interferometer;  development  of  methods  for  calibration  of 
accelerometers  using  free-free  bars;  and  development  of  methods  for 
calibrating  acoustic  pressure  gauges  and  impulse  gauges  for  use  in 
measuring  the  propagation  of  elastic  energy  in  soil  and  rock. 

Hurdle  completed  his  Ph.D.  at  a later  time,  during  work  in  the 
United  Kingdom. 

In  addition  to  being  a Fellow  of  the  Washington  Academy  of 
Sciences,  Dr.  Hurdle  was  also  a Fellow  of  the  Acoustical  Society  of 
America  (ASA).  He  served  the  ASA  in  various  capacities  including  the 
Membership  Committee,  the  Underwater  Acoustics  Technical  Committee, 
the  Nominating  Committee,  and  the  Publications  Policy  Committee. 

Dr.  Hurdle  was  also  a member  of  Sigma  Xi.  He  served  as 
Associate Editor of the U.S. Navy’s Journal of Underwater Acoustics
(1979-2004).  He  also  served  as  General  Chairman  and  Session  Chairman 
at  meetings  of  the  U.S.  Navy  Symposia  on  Underwater  Acoustics.  Dr. 
Hurdle  received  numerous  awards  and  commendations  including  the  Alan 
Berman  Research  Publication  Award  for  “The  Nordic  Seas”  in  1985  and 
the  Navy  Superior  Civilian  Service  Award  in  1987.  In  1998,  he  was  the 
recipient  of  the  Distinguished  Technical  Achievement  Award  from  the 
Oceanic Engineering Society (OES) of the Institute of Electrical and
Electronics Engineers (IEEE). He was cited for his outstanding
contributions  to  understanding  the  oceanography  and  acoustics  of  the 
Nordic  Seas. 
Washington Academy of Sciences

1200 New York Avenue, NW
Room 113

Washington,  DC  20005 


Membership  Application 

Please  fill  in  the  blanks  and  send  your  application  to  the  Washington 
Academy  of  Sciences  at  the  address  above.  We  will  contact  you  as  soon  as 
your  application  has  been  reviewed  by  the  Membership  Committee.  Thank 
you  for  your  interest  in  the  Washington  Academy  of  Sciences. 

(Dr.  Mrs.  Mr.  Ms.)  

Business  Address  

Home  Address  

Email 


Phone 


Cell 


Please indicate:

Preferred mailing address: Business / Home
Type of membership: Regular / Student


Schools  of  Higher  Education  Attended 

Degree 

Dates 

Present  Occupation  or  Professional  Position 

Please  list  memberships  in  scientific  societies  - and  include  office  held: 
Journal of the Washington Academy of Sciences

Instructions  to  Authors 

The  Washington  Academy  of  Sciences  publishes  its  interdisciplinary  peer- 
reviewed  Journal  of  the  Washington  Academy  of  Sciences  four  times  a 
year  — Spring,  Summer,  Fall,  and  Winter. 

1. Deadlines for quarterly submissions are:

Spring - February 1
Summer - May 1
Fall - August 1
Winter - November 1

2.  Draft  manuscripts  using  a word  processing  program  (such  as 
MSWord),  not  PDF. 

3.  Papers  should  be  6,000  words  or  fewer.  With  7 or  more  graphics, 
reduce  the  number  of  words  by  500  for  each  graphic. 

4.  Include  an  abstract  of  150-200  words. 

5.  Graphics  must  be  in  black  & white  or  greytone.  They  must  be 
referenced  in  the  text. 

6.  Use  endnotes,  not  footnotes.  The  bibliography  may  be  in  a style 
considered  standard  for  the  discipline  or  professional  field 
represented  by  the  paper. 

7.  Submit  papers  as  email  attachments  to  the  editor. 

8.  Include  the  author’s  name,  affiliation,  and  contact  information  — 
including  postal  address.  Membership  in  an  Academy-affiliated 
society  may  also  be  noted. 

9.  Manuscripts  are  peer  reviewed  and  become  the  property  of  the 
Washington  Academy  of  Sciences. 

10.  There  are  no  page  charges. 

Please  see  the  Academy’s  web  site,  www.washacadsci.org,  for  the  library 
subscription  rate,  listing  of  articles  dating  to  1899,  and  information  on 
accessing  them. 
Washington Academy of Sciences
Affiliated Institutions

National Institute of Standards and Technology (NIST)

Meadowlark  Botanical  Gardens 

The  John  W.  Kluge  Center  of  the  Library  of  Congress 

Potomac  Overlook  Regional  Park 

Koshland  Science  Museum 

American  Registry  of  Pathology 

Living  Oceans  Foundation 

National  Rural  Electric  Cooperative  Association  (NRECA) 
Delegates to the Washington Academy of Sciences
Representing Affiliated Scientific Societies

Acoustical Society of America, Washington Chapter: Paul Arveson
American Association of Physics Teachers, Chesapeake Section: Frank R. Haig, S.J.
American Astronomical Society: Sethanne Howard
American Ceramics Society: Vacant
American Fisheries Society: Lee Benaka
American Institute of Aeronautics and Astronautics: David W. Brandt
American Meteorological Society, Washington, DC Chapter: Vacant
American Nuclear Society, Washington DC Section: Charles Martin
American Phytopathological Society, Potomac Division: Vacant
American Society for Cybernetics: Stuart Umpleby
American Society for Metals, Washington Chapter: Vacant
American Society of Civil Engineers, National Capital Section: Vacant
American Society of Mechanical Engineers, Washington Section: Daniel J. Vavrick
American Society of Microbiology, Washington Branch: Vacant
American Society of Plant Biologists, Mid-Atlantic: Mark Holland
Anthropological Society of Washington: Vacant
ASM International: Toni Marechaux
Association for Women in Science, DC Metropolitan Chapter: Jodi Wesemann
Association for Computing Machinery, DC Area Chapter: Alan Ford
Association for Science, Technology, and Innovation: F. Douglas Witherspoon
Association of Information Technology Professionals: Chuck Lowe
Biological Society of Washington: Stephen Gardiner
Botanical Society of Washington: Chris Puttock
Capital Area Food Protection Association: Keith Lempel
Chemical Society of Washington: Elise Ann Brown
District of Columbia Institute of Chemists: Vacant
District of Columbia Psychological Association: Tony Jimenez
Eastern Sociological Society: Ronald W. Mandersheid
Electrochemical Society, National Capital Section: Vacant
Entomological Society of Washington: Vacant
Geological Society of Washington: Jeffrey B. Plescia, Jurate Landwehr
Historical Society of Washington DC: Vacant
Human Factors and Ergonomics Society, Potomac Chapter: Gerald P. Krueger
Institute of Electrical and Electronics Engineers, Northern Virginia Section: Murty Polavarapu
Institute of Electrical and Electronics Engineers, Washington Section: Richard Hill
Institute of Food Technologies, Washington DC Section: Vacant
Institute of Industrial Engineers, National Capital Chapter: Neal F. Schmeidler
International Association for Dental Research, American Section: J. Terrell Hoffeld
International Society for the Systems Sciences: Vacant
International Society of Automation, Baltimore Washington Section: Vacant
Instrument Society of America: Hank Hegner
Marine Technology Society: Jake Sobin
Maryland Native Plant Society: Vacant
Mathematical Association of America, Maryland-District of Columbia-Virginia Section: D. S. Joseph
Medical Society of the District of Columbia: Vacant
National Capital Area Skeptics: Vacant
National Capital Astronomers: Jay H. Miller
National Geographic Society: Vacant
Optical Society of America, National Capital Section: James Cole
Pest Science Society of America: Vacant
Philosophical Society of Washington: Eugenie Mielczarek
Society for Experimental Biology and Medicine: Vacant
Society of American Foresters, National Capital Society: Daina Apple
Society of American Military Engineers, Washington DC Post: Vacant
Society of Manufacturing Engineers, Washington DC Chapter: Vacant
Society of Mining, Metallurgy, and Exploration, Inc., Washington DC Section: E. Lee Bray
Soil and Water Conservation Society, National Capital Chapter: Terrell Erickson
Technology Transfer Society, Washington Area Chapter: Richard Leshuk
Virginia Native Plant Society, Potowmack Chapter: Vacant
Washington DC Chapter of the Institute for Operations Research and the Management Sciences (WINFORMS): Russell Wooten
Washington Evolutionary Systems Society: Vacant
Washington History of Science Club: Albert G. Gluckman
Washington Paint Technology Group: Vacant
Washington Society of Engineers: Alvin Reiner
Washington Society for the History of Medicine: Alain Touwaide
Washington Statistical Society: Michael P. Cohen
World Future Society, National Capital Region Chapter: Jim Honig
Editor’s Comments

In  the  summer  of  2015  we  celebrate  the  diversity  of  topics 
presented  in  our  Journal.  The  papers  in  this  issue  range  from  stars  to 
rabbits.  Before  I briefly  describe  them,  let  me  say  it  is  an  honor  to  serve 
as  the  new  editor  of  the  Journal  of  the  Washington  Academy  of  Sciences. 
Our  Journal  has  a long  and  distinguished  history.  It  is  almost  unique  in  its 
breadth.  Each  of  you  can  help  continue  this  history  as  we  go  forward. 
Please  celebrate  with  me  and  continue  to  submit  manuscripts  on  all  sorts 
of  topics:  on  the  many  sciences,  on  technical  subjects,  engineering,  and 
mathematics.  Expand  the  field  to  include  the  history,  sociology,  and 
psychology  of  these  subjects.  I welcome  letters  to  the  editor  and  book 
reviews.  This  is  an  exciting  and  challenging  task.  Join  with  me  as  we  go 
forward. 

First  up  we  have  a paper  by  Trevor  Lipscombe  that  reviews  the 
curious case of Schmidt’s star (first mentioned in 1891 and then relegated
to  the  history  texts).  Trevor  resurrects  the  star  to  glean  possible  new 
information.  To  follow  it  we  have  a paper  by  Gene  Williams  who 
discusses  fatty  acids  and  cancer.  He  explains  one  of  the  things  fish  oil  is 
likely  doing  for  you  should  you  have  any  cancer  cells  wandering  around 
that  have  not  been  killed  off  by  the  immune  system.  Third  we  have  Kelsey 
Gilcrease  writing  about  the  efforts  of  19th  century  game  wardens  (chasing 
those  rabbits)  in  New  Jersey  and  Massachusetts.  To  complete  this  issue  I 
include  a history  paper  on  how  the  rotational  periods  (the  lengths  of  their 
day)  of  Uranus  and  Neptune  were  determined  before  the  space  mission 
Voyager  traveled  by  them. 

In  the  2007  Spring  issue,  Vol  93,  we  published  an  article  by  Y. 
Said  (then  at  George  Mason  University):  “On  the  Eras  in  the  History  of 
Statistics  and  Data  Analysis”.  We  have  since  retracted  this  article  because 
of  suggested  controversy  over  its  uniqueness. 


Sethanne  Howard 
Editor 




Journal  of  the  Washington  Academy  of  Sciences 


Editor  Sethanne  Howard 


sethanneh@msn.com 


Board  of  Discipline  Editors 

The  Journal  of  the  Washington  Academy  of  Sciences  has  a 12-member 
Board  of  Discipline  Editors  representing  many  scientific  and  technical 
fields.  The  members  of  the  Board  of  Discipline  Editors  are  affiliated  with 
a variety  of  scientific  institutions  in  the  Washington  area  and  beyond  — 
government  agencies  such  as  the  National  Institute  of  Standards  and 
Technology  (NIST);  universities  such  as  Georgetown;  and  professional 
associations  such  as  the  Institute  of  Electrical  and  Electronics  Engineers 
(IEEE). 

Anthropology: Emanuela Appetiti, eappetiti@hotmail.com
Astronomy: Sethanne Howard, sethanneh@msn.com
Biology/Biophysics: Eugenie Mielczarek, mielczar@physics.gmu.edu
Botany: Mark Holland, maholland@salisbury.edu
Chemistry: Deana Jaber, djaber@marymount.edu
Environmental Natural Sciences: Terrell Erickson, terrell.erickson1@wdc.usda.gov
Health: Robin Stombler, rstombler@auburnstrat.com
History of Medicine: Alain Touwaide, atouwaide@hotmail.com
Operations Research: Michael Katehakis, mnk@rci.rutgers.edu
Physics: Katharine Gebbie, katharine.gebbie@nist.gov
Science Education: Jim Egenrieder, jim@deepwater.org
Systems Science: Elizabeth Corona, elizabethcorona@gmail.com


Summer 2015


The  Curious  Case  of  Schmidt’s  Star 

Trevor  Lipscombe 

Catholic  University  of  America  Press,  Washington,  DC. 

Abstract 

This article discusses the internal structure of a type of star first
proposed by August Schmidt in 1891, one that causes any light entering
it to move in a circle. An exact analytical solution of the equation of
hydrostatic  equilibrium  is  thus  obtained.  The  solution  is  physically 
realistic,  in  the  sense  that  the  central  density,  central  pressure,  and  total 
mass  are  all  finite,  while  both  density  and  pressure  drop  to  zero  at  the 
outer  radius  of  the  star.  In  the  core  of  the  star,  the  pressure  depends  only 
weakly  on  density.  The  outer  layers  of  the  star  can  be  well- 
approximated  as  isothermal.  Schmidt’s  star,  then,  is  a physical  system 
of  historical,  pedagogical,  and  mathematical  interest. 

Introduction 

In  the  late  1800s  astrophysicists  faced  a conundrum.  The  age  of  the 
Earth  had  been  reliably  determined  and  thus  the  minimum  age  of  the  Sun 
was  also  known;  but  given  that  age,  and  the  known  laws  of  physics,  stars 
should  have  burnt  out  long  before.1  We  now  know  that  the  stars  shine 
because  of  nuclear  processes  that  take  place  in  their  core,  processes 
unknown  in  the  nineteenth  century. 

In  1891  a German  scientist,  August  Schmidt  (1840-1929), 
proposed  a radical  solution  to  resolve  the  paradox.  What  if  stars  didn't 
shine  but  were,  in  a sense,  mirages?  That  is  to  say,  suppose  stars  acted  as 
giant  lenses,  with  a refractive  index  that  varies  as  a function  of  radius  and 
that causes any ray of light entering it to move in a circle,
thereby never leaving the star? This would remove any need for a
mechanism by which stars had to burn fuel to generate energy, and so
resolve  the  conundrum. 

Schmidt  proposed  a model  in  which  the  Sun’s  outer  surface  was 
such  an  optical  illusion2.  This  idea  caught  the  attention  of  “a  student  of 
astronomy”  E.J.  Wilczynski,  who  wrote  the  first  English-language  article 
on  Schmidt’s  theory  in  1895.  It  appeared  in  the  first-ever  volume  of  the 
Astrophysical  Journal3  (ApJ),  followed  only  a few  pages  later  by  a note 
from James Edward Keeler, the co-founder and co-editor of the ApJ, who
commented:

“...The theory is apt to be more favorably regarded by mathematicians
than by observers”4,

a sentiment echoed by George Ellery Hale, the other co-founder and
co-editor of the ApJ, who, in the very next volume, wrote:

“As a theoretical discussion the theory is interesting and valuable, but few
observers of the Sun will consider it capable of accounting for the varying
phenomena encountered in their investigations”5.

Here, though, we follow Michael Faraday’s dictum that “Nothing
is too wonderful to be true, if it be consistent with the laws of nature” and
investigate  Schmidt’s  theory,  to  see  whether  such  an  astrophysical  object 
could  actually  exist.  Heretofore,  studies  have  only  paid  attention  to  the 
optical  properties  of  Schmidt’s  star.  (The  exact  nature  of  the  outer  visible 
layer  of  the  Sun,  which  was  in  large  part  what  Schmidt  dealt  with,  still 
generates  controversy.6) 

In  this  article,  we  determine  the  physical  properties  of  Schmidt’s 
star,  which  appears  not  to  have  been  done  before.  The  basic  mechanism 
proposed  by  Schmidt  leads  to  an  exact  solution  for  the  equation  of 
hydrostatic  equilibrium,  which  governs  self-gravitating  stationary  spheres 
of  fluids. 


Density  Determination 

In Waves and Grains, Mark Silverman analyzes the optical
properties of Schmidt’s star7. If a spherical lens has a refractive index n(r)
and possesses spherical symmetry, then Silverman shows that the radial
coordinate r of a point on a light ray varies with the polar angle θ as:

$$\frac{dr}{d\theta} = \frac{r}{R}\sqrt{n^{2}r^{2} - R^{2}} \tag{1}$$


where  R is  the  radius  of  the  star.  For  the  light  rays  to  move  in  a circular 
path,  we  require  that  r = constant,  and  the  refractive  index  in  the  medium 
must  therefore  vary  as: 

$$n(r) = \frac{R}{r}. \tag{2}$$

Note that when r = R we have n = 1, which is the refractive index of the
vacuum.


The  refractive  index  in  a medium  depends,  among  other  things,  on 
the  density  of  that  medium,  which  is  the  basic  mechanism  behind  the 
formation of mirages. The Clausius-Mossotti (or Lorentz-Lorenz) relation
can be used to relate the refractive index of a substance to its density8:

$$\frac{n^{2} - 1}{n^{2} + 2} = K\rho \tag{3}$$


where K is a constant that depends on the particular gas of which the star
consists. Studies of the Lorentz-Lorenz relation for gaseous and liquid
hydrogen show that K remains approximately constant (~1.03 cm³/g)
for a broad range of temperatures (15-298 Kelvin) and pressures (9-200
atm)9, though it may well not hold at the high densities typically found at
the centers of stars. Substituting Eq. (2) into Eq. (3):

$$\frac{R^{2} - r^{2}}{R^{2} + 2r^{2}} = K\rho. \tag{4}$$


When r = R, the density falls to zero, as it should at the outer radius of
the star. Note also that when r = 0, $K\rho(0) = 1$, so that K is the inverse of
the central density $\rho_c$. Thus, using the scaled dimensionless radius
x = r / R, we can write:

$$\rho = \rho_c\,\frac{1 - x^{2}}{1 + 2x^{2}}. \tag{5}$$
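
As a quick consistency check on Eqs. (2) through (5), the short Python sketch below evaluates the Lorentz-Lorenz ratio for n = 1/x and compares it with the scaled density profile. The function names are illustrative and are not taken from the article; the center x = 0 is excluded because n = R/r diverges there.

```python
def refractive_index(x: float) -> float:
    """Eq. (2) in scaled form: n = R/r = 1/x, with x = r/R in (0, 1]."""
    return 1.0 / x


def lorentz_lorenz(n: float) -> float:
    """Left-hand side of Eq. (3): (n^2 - 1) / (n^2 + 2), which equals K * rho."""
    return (n * n - 1.0) / (n * n + 2.0)


def density_ratio(x: float) -> float:
    """Eq. (5): rho / rho_c = (1 - x^2) / (1 + 2 x^2)."""
    return (1.0 - x * x) / (1.0 + 2.0 * x * x)


# Compare the two expressions at a few sample radii.
samples = (0.25, 0.5, 0.75, 1.0)
differences = [abs(lorentz_lorenz(refractive_index(x)) - density_ratio(x))
               for x in samples]

for x, diff in zip(samples, differences):
    print(f"x = {x:.2f}  rho/rho_c = {density_ratio(x):.4f}  difference = {diff:.1e}")
```

The two expressions agree to rounding error at every sample radius, and the density ratio is zero at x = 1, as the text requires at the stellar surface.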


As  a consequence  of  knowing  how  the  density  varies  within  this 
astrophysical  object,  we  can  calculate  its  mass,  M: 


$$M = 4\pi\rho_c R^{3}\int_{0}^{1}\frac{x^{2}\left(1 - x^{2}\right)}{1 + 2x^{2}}\,dx \tag{6}$$


which  integrates  to: 


$$M = 4\pi\rho_c R^{3}\left[\frac{7}{12} - \frac{3\sqrt{2}}{8}\tan^{-1}\sqrt{2}\right] \tag{7}$$

and hence numerically:

$$M \approx 4\pi\,(0.0767)\,\rho_c R^{3}. \tag{8}$$

If the average density is $\bar{\rho}$, then by definition:

$$M = \frac{4\pi R^{3}}{3}\,\bar{\rho}, \tag{9}$$


which  means  that  the  average  density  of  Schmidt’s  star  is  related  to  the 
central  density  by: 


$$\bar{\rho} = 0.2301\,\rho_c. \tag{10}$$
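
The numerical factors quoted for the mass and the mean density can be checked independently. The following is a minimal Python sketch, not part of the original article: it integrates the dimensionless mass integrand with a hand-written composite Simpson rule (an arbitrary choice of quadrature) and compares the result with the closed form 7/12 − (3√2/8) tan⁻¹√2.

```python
import math


def mass_integrand(x: float) -> float:
    """Dimensionless mass integrand x^2 (1 - x^2) / (1 + 2 x^2)."""
    return x * x * (1.0 - x * x) / (1.0 + 2.0 * x * x)


def simpson(f, a: float, b: float, n: int = 1000) -> float:
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


numeric = simpson(mass_integrand, 0.0, 1.0)
closed_form = 7.0 / 12.0 - (3.0 * math.sqrt(2.0) / 8.0) * math.atan(math.sqrt(2.0))

# Mean density over central density: M = (4 pi R^3 / 3) * rho_bar, so the ratio is 3 * integral.
mean_to_central = 3.0 * closed_form

print(f"numerical integral = {numeric:.6f}")
print(f"closed form        = {closed_form:.6f}")
print(f"mean/central ratio = {mean_to_central:.4f}")
```

Both routes give approximately 0.0767 for the integral and 0.2301 for the ratio of mean to central density.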


Pressure  Determination 

The equation of hydrostatic equilibrium of a static, self-gravitating
sphere of fluid of density $\rho(r)$ and pressure $P(r)$ is10:

$$\frac{1}{r^{2}}\frac{d}{dr}\left(\frac{r^{2}}{\rho}\frac{dP}{dr}\right) = -4\pi G\rho(r) \tag{11}$$


or,  in  terms  of  the  dimensionless  radius, 


$$\frac{1}{x^{2}}\frac{d}{dx}\left(\frac{x^{2}}{\rho}\frac{dP}{dx}\right) = -4\pi G R^{2}\rho(x). \tag{12}$$


Thus,  by  means  of  Eq.  (5), 

$$\frac{d}{dx}\left(\frac{x^{2}}{\rho}\frac{dP}{dx}\right) = -4\pi G R^{2}\rho_c\,x^{2}\left(\frac{1 - x^{2}}{1 + 2x^{2}}\right). \tag{13}$$


Integrating: 


$$\frac{x^{2}}{\rho(x)}\frac{dP}{dx} = -\frac{4\pi G R^{2}\rho_c}{24}\left[-4x^{3} + 18x - 9\sqrt{2}\tan^{-1}\sqrt{2}x\right] + \text{const.} \tag{14}$$

Given  that  the  left-hand  side  of  the  equation  is  zero  at  the  center  of  the 
star,  the  constant  must  be  zero.  Hence  we  have: 


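$$\frac{dP}{dx} = \frac{\pi G R^{2}\rho_c^{2}}{6}\left[4x - \frac{18}{x} + \frac{9\sqrt{2}}{x^{2}}\tan^{-1}\sqrt{2}x\right]\left(\frac{1 - x^{2}}{1 + 2x^{2}}\right). \tag{15}$$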
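
The bracketed antiderivative in Eq. (14) can be spot-checked by differentiating it numerically and comparing with the integrand of Eq. (13). The Python sketch below is illustrative only; the helper names and the central-difference step size are assumptions, not part of the article.

```python
import math

SQRT2 = math.sqrt(2.0)


def integrand(x: float) -> float:
    """Right-hand side of Eq. (13) without its prefactor: x^2 (1 - x^2) / (1 + 2 x^2)."""
    return x * x * (1.0 - x * x) / (1.0 + 2.0 * x * x)


def antiderivative(x: float) -> float:
    """Bracket of Eq. (14) divided by 24: (-4x^3 + 18x - 9*sqrt(2)*atan(sqrt(2)*x)) / 24."""
    return (-4.0 * x**3 + 18.0 * x - 9.0 * SQRT2 * math.atan(SQRT2 * x)) / 24.0


def central_difference(f, x: float, h: float = 1e-5) -> float:
    """Second-order central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


sample_points = (0.1, 0.3, 0.5, 0.7, 0.9)
max_error = max(abs(central_difference(antiderivative, x) - integrand(x))
                for x in sample_points)
value_at_center = antiderivative(0.0)

print(f"largest derivative mismatch = {max_error:.1e}")
print(f"antiderivative at x = 0     = {value_at_center}")
```

The antiderivative vanishes at x = 0, which is why the integration constant in Eq. (14) is zero, and its value at x = 1 reproduces the mass integral of Eq. (7).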


This  can  be  integrated  term-by-term  analytically  once  again  to  give: 
[Eq. (16), the unsimplified term-by-term integral, is not legible in the source.]

Eq. (16) simplifies to:

$$P(x) = \frac{\pi G R^{2}\rho_c^{2}}{6}\left[\frac{1}{2} - 7x^{2} - \frac{9}{2}x^{4} - \frac{27}{2}\ln\left(2x^{2} + 1\right) + 9\sqrt{2}\left(x - \frac{1}{x}\right)\tan^{-1}\sqrt{2}x\right] + \text{const.} \tag{17}$$
At the outer radius of the star, the pressure drops to zero and thus P(x = 1)
= 0. This boundary condition allows for the calculation of the integration
constant, since we must have:

$$P(x = 1) = 0 = \frac{\pi G R^{2}\rho_c^{2}}{6}\left[\frac{1}{2} - 7 - \frac{9}{2} - \frac{27}{2}\ln 3\right] + \text{const.} \tag{18}$$

Hence:

$$\text{const} = \frac{\pi G R^{2}\rho_c^{2}}{6}\left[-\frac{1}{2} + 7 + \frac{9}{2} + \frac{27}{2}\ln 3\right]. \tag{19}$$


The  complete  solution  for  the  pressure  distribution  within  Schmidt's  star 
is  thus: 


. 7rGR  2 p2. 
P(x)  = 


^23 + 27  In  3^  „ , 9 4 

- lx"  — x 
2 


J 


27 


f 1 A 

In ^2x2 +l) + 9v2  x — tan”1  V2x 

v X ) 


(20) 


Note  that: 


f 


lim 

x— >0 


tan  1 >[2x 


x 


(21) 


so  that  the  central  pressure  is  indeed  finite  and  has  the  value: 
$$ P_c = \frac{\pi G R^2\rho_c^2}{6}\left(\frac{27\ln 3 - 13}{2}\right) \tag{22} $$


Hence: 


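$$ \frac{P(x)}{P_c} = \frac{\dfrac{23 + 27\ln 3}{2} - 7x^2 - \dfrac{9}{2}x^4 - \dfrac{27}{2}\ln(2x^2+1) + 9\sqrt{2}\left(x - \dfrac{1}{x}\right)\tan^{-1}(\sqrt{2}\,x)}{\dfrac{27\ln 3 - 13}{2}} \tag{23} $$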


A graph  of  the  pressure  and  density  within  the  star,  as  a function  of  stellar 
radius,  is  shown  in  Figure  1. 


Fig. 1. Density and pressure as a function of radius. Curves: density; pressure from Eq. (20).


The pressure and density of the star are such that a good
approximation for their relation is11:

$$ P = 1.01112\,P_c\left[1 - \exp\left(-4.51\,\frac{\rho}{\rho_c}\right)\right] \tag{24} $$


The  exact  relation  of  pressure  with  density  [from  Eqs.  (5)  and  (20)]  is 
compared  with  the  approximate  relationship  of  Eq.  (24)  in  Figure  2. 
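The prefactor in Eq. (24) follows from the normalization described in note 11: for a trial form P/P_c = c[1 − exp(−a ρ/ρ_c)], requiring P = P_c at ρ = ρ_c fixes c = 1/[1 − exp(−a)]. The sketch below is illustrative only. It assumes a = 4.51, which is the exponent that reproduces the quoted prefactor 1.01112, and the function names are hypothetical.

```python
import math


def normalisation(a):
    """Prefactor c fixed by demanding P = P_c when rho = rho_c."""
    return 1.0 / (1.0 - math.exp(-a))


def approx_pressure_ratio(rho_ratio, a=4.51):
    """Approximate P/P_c as a function of rho/rho_c (form of Eq. 24)."""
    return normalisation(a) * (1.0 - math.exp(-a * rho_ratio))


def low_density_slope(a):
    """Slope of P/P_c against rho/rho_c as rho -> 0 (see note 11)."""
    return a * normalisation(a)


a = 4.51  # assumed exponent, chosen to match the prefactor in Eq. (24)
print(f"prefactor c       = {normalisation(a):.5f}")
print(f"P/P_c at centre   = {approx_pressure_ratio(1.0):.6f}")
print(f"P/P_c at surface  = {approx_pressure_ratio(0.0):.6f}")
print(f"low-density slope = {low_density_slope(a):.4f}")
```

The low-density slope is the quantity that a linear regression on the outer-layer data would estimate, which is how note 11 describes obtaining the fit.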

Hence, by starting with a simple requirement, namely that light rays travel in
circles within the astrophysical object, we can determine that such a star
has a finite central density and pressure and determine the variation of the
pressure and density as a function of the radius.

Comparison with Polytropic Models

The  usual  approach  to  stellar  astrophysics  is  to  explore  the 
polytrope  equation12.  That  is  to  say,  one  seeks  solutions  of  the  form: 

$$ P = K\rho^{1+1/n} \tag{25} $$

in  the  equation  for  hydrostatic  equilibrium.  This  model  generates  the 
Lane-Emden equation, named after Jonathan Homer Lane, an
astrophysicist  who  spent  many  years  in  Washington  DC,  and  Swiss 
astrophysicist  Robert  Emden.  Here  n is  not  the  refractive  index,  but  the 
so-called  polytropic  index  of  the  star. 

It  is  complicated  to  compare  Schmidt’s  star  with  standard 
polytrope  solutions.  For  example,  as  seen  from  Eq.  (10),  the  central 
density  is  related  to  the  average  density  by: 

$$ \rho_c = 4.348\,\bar{\rho} \tag{26} $$

which, by use of the Polytrope Tool13, is equivalent to the central density
of a polytrope whose index is n = 1.31. The standard model for the Sun
(the Eddington model, for which n = 3) has the value $\rho_c = 54.18\,\bar{\rho}$, just
over a factor of ten larger14.
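The ratio of central to mean density can be reproduced directly from the density profile. The sketch below assumes the profile ρ/ρ_c = (1 − x²)/(1 + 2x²) implied by the right-hand side of Eq. (13). With M = 4πρ_c R³ I, where I is the integral of x²ρ/ρ_c from 0 to 1, the mean density is 3ρ_c I, so ρ_c/ρ̄ = 1/(3I). The Simpson integrator and all names are illustrative choices, not the author's method.

```python
import math


def density_ratio(x):
    """rho(x)/rho_c, as implied by the right-hand side of Eq. (13)."""
    return (1.0 - x**2) / (1.0 + 2.0 * x**2)


def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


# Dimensionless mass integral I = int_0^1 x^2 rho/rho_c dx.
mass_integral = simpson(lambda x: x**2 * density_ratio(x), 0.0, 1.0)

# Closed form: the bracket of Eq. (14) evaluated at x = 1, divided by 24.
closed_form = (14.0 - 9.0 * math.sqrt(2.0) * math.atan(math.sqrt(2.0))) / 24.0

central_over_mean = 1.0 / (3.0 * mass_integral)

print(f"mass integral (numerical)  = {mass_integral:.6f}")
print(f"mass integral (closed form) = {closed_form:.6f}")
print(f"rho_c / mean density        = {central_over_mean:.3f}")
```

The computed ratio agrees with the value in Eq. (26) to within rounding of the mass factor.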

The  central  pressure  in  Schmidt’s  star  is  given  by  numerically 
evaluating  Eq.  (22): 

$$ P_c = 8.33126\,\frac{\pi G R^2\rho_c^2}{6} \tag{27} $$

The  mass,  though,  is  given  in  Eq.  (8),  as: 

$$ M \approx 4\pi\rho_c R^3\,(0.0767) \tag{28} $$


and  so  by  substituting  in  for  the  central  density,  we  obtain: 

$$ P_c = \frac{8.33126\,\pi G R^2}{6}\left(\frac{M}{4\pi R^3\,(0.0767)}\right)^2 \tag{29} $$


or: 


$$ P_c \approx 4.7\,\frac{G M^2}{R^4} \tag{30} $$


Again,  by  means  of  the  Polytrope  Tool,  this  is  an  expression  equivalent  to 
the  central  pressure  of  a polytrope  of  index  n = 2.595,  almost  double  the 
index  obtained  from  consideration  of  the  central  density.  The  Eddington 
model for the Sun has the numerical factor 11.05 rather than 4.7, a central
pressure some 2.35 times higher than the Schmidt star.15
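The numerical factors quoted in Eqs. (27) to (30) can be reproduced with a few lines of arithmetic. The sketch below simply evaluates the expressions as stated in the text: the bracket (27 ln 3 − 13)/2 from Eq. (22), and the coefficient of GM²/R⁴ obtained by substituting ρ_c = M/[4πR³(0.0767)] into Eq. (27), which reduces to bracket/(96π × 0.0767²). The variable names are illustrative.

```python
import math

# Numerical value of the bracket in Eq. (22).
bracket = (27.0 * math.log(3.0) - 13.0) / 2.0

# Mass factor from Eq. (28): M = 4 pi rho_c R^3 * mass_factor.
mass_factor = 0.0767

# P_c = bracket * pi G R^2 rho_c^2 / 6 with rho_c = M / (4 pi R^3 mass_factor)
# gives P_c = coefficient * G M^2 / R^4 with the coefficient below.
coefficient = bracket / (96.0 * math.pi * mass_factor**2)

# Factor quoted in the text for the Eddington (n = 3) model.
eddington_coefficient = 11.05

print(f"bracket in Eq. (22)        = {bracket:.5f}")
print(f"coefficient of G M^2 / R^4 = {coefficient:.2f}")
print(f"Eddington / Schmidt ratio  = {eddington_coefficient / coefficient:.2f}")
```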


For  further  comparison,  note  that  for  a polytrope: 


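$$ \frac{P}{P_c} = \left(\frac{\rho}{\rho_c}\right)^{1+1/n} \tag{31} $$

Consequently:

$$ \frac{P}{P_c}\,\frac{\rho_c}{\rho} = \left(\frac{\rho}{\rho_c}\right)^{1/n} \tag{32} $$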

We can thereby define an effective pointwise polytropic index n by:

$$ n = \frac{\ln(\rho/\rho_c)}{\ln\left[(P/P_c)(\rho_c/\rho)\right]} \tag{33} $$


A curve  of  n as  a function  of  radius  is  shown  in  Figure  3. 
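Taking logarithms of Eq. (32) gives n = ln(ρ/ρ_c) / [ln(P/P_c) − ln(ρ/ρ_c)], which can be evaluated at any radius where both ratios are known. The sketch below is a hypothetical helper, not the author's code. It implements that expression and checks it against exact polytropic relations, for which the input index should be recovered. The expression is undefined at the center itself, where both logarithms vanish.

```python
import math


def effective_index(rho_ratio, p_ratio):
    """Pointwise polytropic index from rho/rho_c and P/P_c (log of Eq. 32)."""
    log_rho = math.log(rho_ratio)
    log_p = math.log(p_ratio)
    return log_rho / (log_p - log_rho)


def polytrope_pressure_ratio(rho_ratio, n):
    """P/P_c for an exact polytrope of index n, Eq. (31)."""
    return rho_ratio ** (1.0 + 1.0 / n)


# Sanity check: an exact polytrope must return its own index,
# including a negative index.
for n_true in (1.5, 3.0, -2.0):
    rho_ratio = 0.4  # arbitrary illustrative density ratio
    p_ratio = polytrope_pressure_ratio(rho_ratio, n_true)
    print(n_true, round(effective_index(rho_ratio, p_ratio), 6))
```

Applied to tabulated values of ρ(x)/ρ_c and P(x)/P_c, the same function yields a curve of n against radius.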


Fig. 3. Effective polytropic index, n, as a function of radius.


The best fits have n = −1.22 from x = 0 to 0.5, halfway out
through the star. This is a negative polytrope of varying index.
Approaching x = 1, the index becomes large and negative, so that
P ~ const × ρ, which is the equation describing an isothermal outer layer
(infinite polytropic index) to Schmidt’s star.

Discussion 

Negative-index  polytropes  were  first  discussed  by  Eddington  in 
1931 16. In the same paper, he used the phrase “Incomplete polytropes” to
describe  a structure  similar  to  Schmidt’s  star,  wherein  the  inner  core  might 
best  be  modeled  by  one  value  of  the  polytropic  index  and  the  outer  layers 
by  another.  Viala  and  Horedt17  showed  that  astrophysical  objects  with 
negative  polytropic  indexes  are  good  models  for,  among  other  things, 
interstellar  clouds.  In  addition,  they  showed  that  sufficiently  negative 
indexes (n < −1) can be stable. Chaplygin gases, which have negative
indexes,  are  currently  of  great  interest  in  cosmology,  as  they  are 
candidates  for  dark  matter  and  can  form  stable  gravitational  structures 
(both  in  classical  Newtonian  and  general  relativistic  gravity)18. 

In  Schmidt's  star,  both  the  pressure  and  the  density  decrease  when 
moving  radially  outwards.  In  the  outer  layers,  the  pressure  varies  almost 
linearly with density, which suggests an isothermal envelope for the star.
However, in the stellar core it is similar to a negative-index polytrope of
index n < −1, in that the temperature must increase radially outwards. This
creates  a significant  problem  for  Schmidt's  star,  in  that  to  be  physically 
realistic,  the  model  must  represent  an  astrophysical  object  whose  core  is 
being  heated  externally,  either  by  particles  or  by  radiation,  in  a spherically 
symmetric  manner,  but  whose  outer  layers  are  isothermal. 

Keeler  and  Hale's  original  criticisms  of  Schmidt’s  proposal  were 
that  it  was  of  importance  only  mathematically.  Regrettably  that  may 
indeed  be  the  case.  Schmidt’s  star,  though,  remains  of  interest.  Such 
interest  is  not  just  historic;  Schmidt’s  star  also  is  of  pedagogical  value19. 
Undergraduate  physics  students,  as  an  exercise  in  physical  modeling, 
could  be  presented  with  the  Schmidt-Silverman  equation  for  the  refractive 
index  of  Schmidt’s  star  and  then  asked  to  solve  for  the  pressure  and 
density  of  this  object.  This  requires  knowledge  of  various  disciplines 
within  physics.  They  could  also  be  asked  to  comment  on  whether  such  an 
object  could  indeed  exist,  which  requires  them  to  recognize  that  the 
temperature  in  an  astrophysical  object  should,  to  be  realistic,  fall  off  with 
increasing  radius. 

Conclusions 

In  this  paper,  we  have  explored  the  structure  of  an  astrophysical 
object  whose  physics  has  not  previously  been  determined  completely.  By 
requiring  such  a star  to  act  as  a graded  refractive  index  lens  that  causes  all 
light entering it to move in a circular path, we have been able to
determine  the  density  of  the  star  and  its  pressure.  Such  a star  has  a 
pointwise  negative  polytropic  index,  but  its  pressure  and  density  both 
decrease  as  the  radius  increases,  and  the  value  of  the  index  is  such  that 
Schmidt’s  star  is  likely  to  be  stable.  While  the  study  was  motivated  by 
Schmidt’s  suggestion  in  1891,  the  density  profile  and  pressure  profile  here 
represent an exact solution of the equation of hydrostatic equilibrium,
whether one gives credence to Schmidt’s belief or not. This is one of the
few  physically  motivated  stellar  models,  other  than  polytropes,  that  is  not 
singular  at  the  origin  nor  infinite  in  extent.  While  the  temperature  profile 
makes  Schmidt's  star  likely  to  be  physically  unrealistic,  it  remains  of 
historical,  mathematical,  and  pedagogical  value. 

This  paper  is  dedicated  to  Kelsey  Schmidt  and  Tom  LaCour  on  the 
occasion  of  their  marriage. 


REFERENCES 


1 Frank D. Stacey “Kelvin’s Age of the Earth paradox revisited,” Journal of

Geophysical  Research  105(B6),  pp  13155-13158  (2000). 

2 August  Schmidt  “Die  Strahlenbrechung  auf  der  Sonne:  ein  geometrisches  Beitrag  zur 

Sonnenphysik”  (Stuttgart:  Metzlerscher  Verlag,  1891). 

3 Ernest  J Wilczynski  “Schmidt’s  Theory  of  the  Sun”,  Astrophysical  Journal  vol.  1,  pp 

112-126  (1895). 

4 James  E Keeler  “Schmidt’s  Theory  of  the  Sun”,  Astrophysical  Journal  vol.  1,  pp  178- 

179  (1895). 

5 George  E Hale  “Notes  on  Schmidt’s  Theory  of  the  Sun”,  Astrophysical  Journal  vol. 

2,  pp  69-74  (1895). 

6 See,  for  example,  Pierre-Marie  Robitaille  “Commentary  on  the  Radius  of  the  Sun: 

Optical  Illusion  or  Manifestation  of  a Real  Surface”,  Progress  in  Physics  Vol.  2, 
L5-L6  (2013). 

7 Mark  P.  Silverman  Waves  and  Grains  (Princeton  N.J.:  Princeton  University  Press, 

1998),  pp  27-30. 

8 See, for example, Charles Kittel Introduction to Solid State Physics (8th edition) (New

York:  John  Wiley  & Sons,  1990). 

9 Dwain  E.  Diller  “Refractive  Index  of  Gaseous  and  Liquid  Hydrogen”  J.  Chem  Phys. 

49(7)  3096-3105  (1968). 

10  See,  for  example,  S.  Chandrasekhar  An  Introduction  to  the  Study  of  Stellar  Structure 

(Chicago:  University  of  Chicago  Press,  1939),  page  63,  equation  (6). 

11 By inspection, a trial solution is P = const[1 − exp(−aρ)]. The requirement that
P = 1 when ρ = 1 determines the constant in terms of a. At low densities, we
have P = {a/[1 − exp(−a)]}ρ and so use of the data at low densities in a linear
regression estimator (such as LINEST in Microsoft Excel) leads to the best
numerical fit.

12  S.  Chandrasekhar  An  Introduction  to  the  Study  of  Stellar  Structure  (Chicago: 

University of Chicago Press, 1939), pp. 84-182.


Summer  2015 




13 http://www.webnucleo.org/home/online_tools/polytrope/0.8/

14  S.  Chandrasekhar  An  Introduction  to  the  Study  of  Stellar  Structure  (Chicago: 

University  of  Chicago  Press,  1939),  equation  56,  chapter  6,  page  230. 

15  S.  Chandrasekhar  An  Introduction  to  the  Study  of  Stellar  Structure  (Chicago: 

University  of  Chicago  Press,  1939),  chapter  6,  page  230,  equation  57. 

16  Arthur  S.  Eddington  “A  Theorem  Concerning  Incomplete  Polytropes”  Mon.  Not. 

Roy.  Ast.  Soc.  91  pp  440-444  (1931). 

17  Yves  P.  Viala  and  Georg  P.  Horedt  “Polytropic  Sheets,  Cylinders,  and  Spheres  with 

Negative  Index”.  Astron.  & Astrophys.  33  pp  195-202  (1974). 

18 Trevor C. Lipscombe “Self-gravitating clouds of generalized Chaplygin and
anti-Chaplygin gases,” Physica Scripta 83(3) ID = 035901 (2011).

19  Another  example  of  historically  motivated  physics  of  pedagogical  value  might  be  the 

modeling of Newton’s-bucket experiment in Carl E. Mungan and Trevor C.
Lipscombe  “Newton’s  Rotating  Water  Bucket:  A Simple  Model,”  Journal  of  the 
Washington  Academy  of  Sciences  99(2),  pp  15-24  (2013). 


Bio 

Trevor  Lipscombe  is  the  director  of  the  Catholic  University  of  America 
Press.  He  holds  a doctorate  in  theoretical  physics  from  Oxford,  is  a Fellow 
of  the  Royal  Astronomical  Society,  and  tries  to  do  theoretical  physics  in 
his spare time. He is the author of “The Physics of Rugby” (Nottingham
University  Press,  2009);  coauthor,  with  Alice  Calaprice,  of  “Albert 
Einstein:  A Biography”  (Greenwood,  2005);  and  editor  of  a critical 
edition  of  Blessed  John  Henry  Newman's  novel  “Loss  and  Gain:  The 
Story  of  a Convert”  (Ignatius  Press,  2012). 


Washington  Academy  of  Sciences 




Docosahexaenoic  Acid  Induces  Death  in  Murine 
Leukemia  Cells  by  Activating  the  Extrinsic  Pathway 

of  Apoptosis. 

E.  Eugene  Williams 

Salisbury  University 

Abstract 

Docosahexaenoic  acid  (DHA)  is  a unique  fatty  acid  that  is  found 
predominantly  in  the  phospholipids  of  cell  membranes.  It  has  wide- 
ranging  therapeutic  effects  that  are  broadly  appreciated  but  poorly 
understood.  Its  principal  location  in  the  membranes  of  cells  suggests  that 
these  myriad  effects  are  manifest  there.  When  cultured  in  DHA-enriched 
medium,  cells  of  the  murine  leukemia  cell  line  T27A  took  up  the  fatty 
acid  and  incorporated  it  into  cellular  phospholipids,  particularly  those  of 
the  plasma  membrane.  Culture  in  DHA-enriched  media  also  caused 
significant  dose-dependent  cell  death  accompanied  by  increased  plasma 
membrane  bleb  formation.  Cysteine-dependent  aspartate-directed 
proteases  (caspases)-3  -8  and  -9  were  also  activated,  establishing 
apoptosis  as  the  mechanism  of  DHA-induced  cell  death.  Inhibition  of  any 
one  of  these  caspases  rescued  the  cells  from  apoptotic  death.  Caspase 
inhibition  experiments  identified  T27A  cells  as  belonging  to  the  type  II 
group  of  apoptotic  cells  and  showed  that  apoptosis  was  initiated  via  the 
extrinsic  pathway.  Together  these  and  previous  data  support  the 
hypothesis  that  DHA  causes  cell  death  in  leukemic  cells  by  inducing 
alterations  in  the  structure  of  lipid  rafts  that  lead  to  the  ligand-independent 
activation  of  death  receptors  and  apoptosis. 

Introduction 

Docosahexaenoic  acid  (DHA,  22:6n-3)  is  a unique  fatty  acid  that  is 
found  in  the  cells  of  a wide  range  of  organisms  from  bacteria  to  humans. 
It  is  the  longest  and  most  unsaturated  of  the  commonly  occurring  n-3 
(omega-3, ω-3) fatty acids (Salem et al. 1986). DHA has diverse
therapeutic  properties  that  are  acclaimed  in  both  the  scientific  and  lay 
communities (Stillwell and Wassall 2003; Siddiqui et al. 2004; Chapkin
et  al.  2009).  A remarkable  number  of  conditions  and  diseases  have  been 
demonstrated  to  be  prevented,  mitigated,  counteracted  or  improved  by 
DHA.  These  include  maladies  as  disparate  as  cancer,  heart  disease,  cystic 
fibrosis,  diabetes,  immune  function  and  even  psychiatric  disorders 
(Stillwell and Wassall 2003; Siddiqui et al. 2004; Calder 2012;
Mischoulon and Freeman 2013). While the relationship between DHA
and improved health is widely appreciated, the basic molecular
mechanism  underlying  this  relationship  remains  unclear.  As  noted  by 
Stillwell  (2008),  the  assortment  of  seemingly  unrelated  biochemical  and 
physiological  processes  underlying  the  diseases  and  conditions  that  are 
influenced  by  DHA  suggests  that  this  fatty  acid  influences  a fundamental 
cellular  function  or  property. 

DHA  has  been  shown  to  have  powerful  anti-cancer  effects  in 
animals  and  cultured  tumor  cells  (Siddiqui  et  al.  2004).  For  example,  it  is 
effective  at  reducing  the  accumulation  of  leukemic  cells  in  vitro  and  in 
slowing the rate of progression of leukemia in animals (e.g., Jenski et al.
1993;  Jenski  et  al.  1995;  Zerouga  et  al.  1996).  DHA  has  been  shown  to 
induce  cell  death  in  human  and  mouse  leukemia  cells  in  a dose  dependent 
manner  (Kafrawy  et  al.  1998;  Yamagami  et  al.  2009)  and  it  has  been 
suggested  the  anti-leukemia  properties  of  DHA  are  in  general  founded  on 
the  ability  of  DHA  to  induce  cell  death  in  tumor  cells  (Serini  et  al.  2009). 

Despite  continuing  efforts,  it  is  currently  unclear  precisely  how 
DHA  triggers  cell  death.  DHA  can  be  converted  into  reactive  oxygen 
species  that  can  influence  cell  survival  (Siddiqui  et  al.  2008),  and  into 
powerful  anti-inflammatory  and  pro-resolving  mediators  (resolvins, 
protectins  and  maresins)  that  can  influence  cell  survival  and  disease 
etiology  (Serhan  et  al.  2014;  Colas  et  al.  2014;  Dalli  et  al.  2015).  DHA 
can  also  affect  gene  expression  (Berger  et  al.  2006),  the  acylation  patterns 
of  membrane  proteins  (Webb  et  al.  2000),  and  the  function  of  enzymes 
and  ion  channels  (Matta  et  al.  2007).  However  a large  and  growing  body 
of  evidence  indicates  that  DHA  induces  cell  death  only  after  it  has  become 
incorporated  into  membrane  phospholipids  and  that  the  initial  triggering 
event  in  cell  death  is  a membrane-based  phenomenon  (Stillwell  and 
Wassall 2003; Stillwell et al. 2005; Calder 2012).

There  is  substantial  physiological,  biochemical,  biophysical,  and 
morphological  evidence  that  DHA-containing  phospholipids  change  the 
structure  of  cell  membranes  (Mitchell  et  al.  2003;  Niu  and  Mitchell  2005; 
Chapkin  et  al.  2008;  Shaikh  2010;  Rockett  et  al.  2012;  Teague  et  al.  2013; 
Pinot  et  al.  2014).  Indeed,  whether  provided  as  a dietary  component  to  an 
individual  organism  (Lien  2009)  or  as  a component  of  the  incubation 
medium  of  cultured  cells  (Zerouga  et  al.  1996;  Williams  et  al.  1998; 
Williams et al. 1999), DHA is taken up by cells and incorporated into the
phospholipids of membranes. The plasma membrane in particular appears
to  be  a primary  location  of  action  for  the  tumor  cell  killing  properties  of 
DHA  (Jenski  et  al.  1993;  Pascale  et  al.  1993;  Williams  et  al.  1998; 
Williams  et  al.  1999).  Of  particular  interest  in  this  regard  is  the  influence 
of  DHA-containing  phospholipids  on  the  membrane  microdomain 
structures  known  as  lipid  rafts.  Lipid  rafts  serve  as  platforms  for  the 
regulation  of  cell  processes  and  represent  a selective  cellular  compartment 
that  can  co-localize  and  modulate  the  activities  of  enzymes,  receptors  and 
other  proteins  (Simons  and  Ikonen  1997;  Lingwood  and  Simons  2010). 
There  is  evidence  that  DHA-containing  phospholipids  induce  cell  death 
by  altering  the  structure  or  organization  of  lipid  rafts,  and  that  this 
influence  on  membrane  structure  is  the  first  and  most  important  step  in 
DHA-induced  cell  death  (Stillwell  et  al.  2005;  Schley  et  al.  2007;  Chapkin 
et  al.  2008). 

Other  evidence  strongly  suggests  that  DHA  causes  cell  death  in 
tumor  cells  by  the  induction  of  apoptosis  (Blanckaert  et  al.  2010;  Kang  et 
al.  2010).  There  are  two  distinct  activation  pathways  for  apoptosis.  The 
extrinsic  pathway  involves  plasma  membrane-associated  death  receptors 
and  a cysteine-dependent  aspartate-directed  protease,  caspase-8.  The 
intrinsic  pathway  involves  the  release  of  cytochrome  c from  mitochondria 
and  the  activation  of  caspase-9.  These  two  initiating  events  then  cause  the 
activation  of  downstream  effector  caspases  including  caspase-3  which  in 
turn cleaves a series of intracellular substrates to continue the apoptotic
cascade.  Lipid  rafts  are  importantly  involved  in  the  extrinsic  apoptotic 
pathway  as  the  death  receptors,  a subset  of  the  tumor  necrosis  factor 
receptor  superfamily,  are  among  those  receptors  regulated  by  lipid  rafts 
(Gajate  et  al.  2009;  Lang  et  al.  2012). 

Thus,  there  is  evidence  that  DHA  causes  the  death  of  many  types 
of  tumor  cells,  that  the  cause  of  cell  death  in  many  of  these  instances  is 
the  induction  of  apoptosis,  that  DHA  alters  the  structure  of  lipid  rafts,  and 
that  lipid  rafts  regulate  the  receptors  involved  in  initiating  the  extrinsic 
pathway  of  apoptosis.  This  study  attempts  to  connect  these  links  by  testing 
the  hypothesis  that  DHA  causes  cell  death  in  leukemia  cells  by 
specifically  triggering  the  extrinsic  pathway  of  apoptosis.  We  show  that 
DHA  is  selectively  incorporated  into  the  plasma  membrane  of  murine 
leukemia  (T27A)  cells.  We  use  both  morphological  and  biochemical 
means to demonstrate that DHA induces apoptosis in these cells. By
monitoring  and  manipulating  the  activities  of  caspases  -8,  -9  and  -3,  we 
further  show  that  all  three  caspases  are  activated  by  DHA  and  that  the 
inhibition  of  any  one  of  them  rescues  T27A  cells  from  DHA-induced 
apoptosis.  Together  these  and  previous  data  support  the  hypothesis  that 
DHA  causes  cell  death  by  inducing  alterations  in  the  structure  of  lipid  rafts 
that  lead  to  the  ligand-independent  activation  of  death  receptors  and 
apoptosis. 


Methods 

Materials 

T27A murine leukemia cells were obtained from American Type
Culture Collection (Manassas, VA). Fatty acids and fatty acid methyl ester
(FAME) reference standards were purchased from Nu-Chek-Prep
(Elysian, MN). RPMI-1640 culture medium supplemented with 2 mM
glutamine, 25 mM HEPES, 50 µg/mL streptomycin and 100 units/mL
penicillin, was from Cambrex Bio Science (Walkersville, MD). Bovine
calf serum was from Hyclone (Logan, UT). Irreversible, cell-permeable
inhibitors of caspases -3 (Z-D[O-Me]E[O-Me]VD[O-Me]-FMK), -8
(Z-IE[O-Me]TD[O-Me]-FMK), and -9 (Z-LE[O-Me]HD[O-Me]-FMK)
were from Calbiochem (EMD Biosciences, Inc., La Jolla, CA). The
colorimetric assay kits for measuring the activities of caspases -3, -8 and
-9 were from BioVision (Mountain View, CA). Staurosporin, SiO2
(“Celite”), and dimethyl sulfoxide (DMSO) were from Sigma Chemical
Co. (St. Louis, MO). All other chemicals were from Sigma or Thermo
Fisher Scientific (Waltham, MA).

Cell  culture 

Except  where  noted,  T27A  cells  were  cultured  in  RPMI-1640 
medium  supplemented  as  described  above  and  with  10%  (vol/vol)  bovine 
calf  serum  in  25  cm2  culture  flasks  maintained  at  37°C  under  an 
atmosphere  of  5%  CO2  in  humidified  air.  As  noted  previously  (Zerouga 
et  al.  1996;  Williams  et  al.  1998;  Williams  et  al.  1999),  under  these 
conditions  cultures  doubled  every  12  to  15  hours.  Cell  viability  was 
monitored  by  trypan  blue  exclusion  (0.04%  in  phosphate  buffered  saline 
[PBS, 0.154 M NaCl, 0.016 M NaH2PO4, pH 7.2]).

Supplementation of culture media with fatty acids

DHA  and  oleic  acid  (OA,  18:1  n-9)  were  added  to  RPMI  culture 
medium  using  the  methods  of  Spector  and  Hoak  (1969)  exactly  as 
described  by  Williams  et  al.  (1998).  The  fatty  acid  was  dissolved  in 
hexane and transferred to an Erlenmeyer flask containing SiO2. Ten g of
SiO2 were used per mmol of fatty acid. The hexane was removed
completely by a gentle stream of N2 before the dry mixture was transferred
to a solution of fatty acid-free bovine serum albumin (1% fatty acid-free
BSA in RPMI supplemented as above, but excluding serum). After
stirring for 30 min in the dark, the RPMI/fatty acid mixture was
centrifuged for 30 min at 600 × g to remove the SiO2 and the medium
was sterilized by filtration (0.22 µm). Bovine calf serum was added to
10% (vol/vol) of the total just before use. Calf serum contributes a small
amount of fatty acids to the final culture medium, but less than 1% of that
is  DHA  (Williams  et  al.  1998).  Unless  noted  otherwise,  cells  were 
incubated  in  fatty  acid-enriched  medium  for  3 days  (68-76  hours).  Under 
these  conditions  T27A  cells  take  up  considerable  DHA,  and  at  DHA 
concentrations  below  0.61  mM  they  remain  >90%  viable  (Williams  et  al. 
1998;  Williams  et  al.  1999;  and  see  below). 

Assay  of  caspase  activity  and  caspase  inhibition 

The activities of caspases -3, -8, and -9 were measured
spectrophotometrically in 96-well plates. For each assay, T27A cells were
cultured in RPMI medium containing no additions, 1.3 µM staurosporin,
or 0.61 mM DHA. After 16 h of culture, cells from each flask were
harvested by low-speed centrifugation. Cell viability (always greater than
90% in control cells) was assessed by trypan blue exclusion and cell
density was determined by duplicate counts on a hemacytometer. For each
treatment, 3 × 10^6 cells were treated with 50 µL of lysis buffer according
to the manufacturer’s instructions. After centrifugation, 30 µL of cell
lysate were combined with 20 µL of caspase assay medium in a well of the
plate, mixed, and allowed to incubate at 37°C for 1 hour before the
absorbance was read at 405 nm. Background values were subtracted from
all absorbances and all treatment values were expressed as percentage of
the control. For caspase inhibition experiments, cells were exposed to 10
µM inhibitor in DMSO (0.1% final concentration) for 30 min before
exposure to control or fatty acid-enriched medium.
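The normalization described above (background subtraction, then expression as a percentage of the control) can be sketched as follows. This is an illustrative helper, not the software used in the study, and the absorbance readings are hypothetical values chosen only to show the arithmetic.

```python
def percent_of_control(sample_a405, control_a405, background_a405):
    """Background-corrected absorbance expressed as a percentage of control."""
    sample = sample_a405 - background_a405
    control = control_a405 - background_a405
    if control <= 0:
        raise ValueError("control must exceed background after subtraction")
    return 100.0 * sample / control


# Hypothetical absorbances at 405 nm (not data from the study).
background = 0.050
control = 0.250
dha_treated = 0.650

activity = percent_of_control(dha_treated, control, background)
print(f"caspase activity: {activity:.0f}% of control")
```

By construction, the control itself maps to 100%, so treatment values above 100% indicate caspase activation relative to untreated cells.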

The linearity of the caspase assays was confirmed using
p-nitroaniline as a standard. Regression analyses of the resulting standard
curves yielded lines with r² > 0.990. Staurosporin was used as a positive
control  and  only  those  assays  that  showed  staurosporin-induced  caspase 
activity  were  analyzed  further. 

Isolation  of  plasma  membranes 

After  48  h of  culture  in  either  normal  (control)  medium  or  medium 
enriched  with  0.3  mM  DHA,  cell  cultures  were  disrupted  by  sonication 
and  the  resulting  homogenate  was  fractionated  by  the  centrifugation 
protocol  of  Kaduce  et  al.  (1977)  using  the  buffers  of  Molnar  et  al.  (1969) 
as described by Williams et al. (1998; 1999). Briefly, T27A cells were
collected by centrifugation (500 × g for 15 min), resuspended in 0.25 M
sucrose buffer (0.25 M sucrose, 40 mM NaCl, 100 mM KCl, 5 mM
MgSO4·7H2O, 20 mM Trizma base, pH 7.2 with HCl), and disrupted (on
ice) by sonication for 2 × 35 sec using a tip-type sonicator (Fisher
Scientific Model 500, 35 seconds, pulse on 1 sec, pulse off 1.5 sec). The
cell homogenate was centrifuged at 27,000 × g for 10 min to remove
undisrupted cells and cellular debris and the supernatant over the resulting
pellet was spun for 1 hour at 105,000 × g to produce a mixed membrane
pellet. The mixed membrane pellet was layered onto a pad of 1.1 M
sucrose (remaining composition as above) and spun at 107,000 × g for 16
hours.  The  white  interfacial  material  was  collected  and  washed  twice  in 
excess  PBS.  The  resulting  membrane  represents  a better  than  8-fold 
purification  of  plasma  membrane  over  the  mixed  membrane  fraction 
(Kaduce  et  al.  1977)  and  has  been  used  in  previous  studies  to  determine 
the  effects  of  DHA  on  membrane  structure  and  composition  in  T27A  cells 
(Williams  et  al.  1998;  Williams  et  al.  1999). 

Lipid  extraction  and  gas  chromatography  of  membrane  fatty  acids 

Total lipids were extracted from whole cell preparations and
from isolated plasma membranes using CHCl3/CH3OH (Bligh and Dyer
1959) and concentrated under a stream of dry N2 gas. Phospholipids
separated from neutral lipids (e.g., triacylglycerols) by silicic acid
chromatography (Wren 1960; Williams and Somero 1996) were
transesterified into FAMEs using methanolic sodium methoxide (Eder et
al. 1992). FAMEs were resolved using a 0.25 mm × 30 m HP-23 cis/trans
FAME column in a Hewlett-Packard 6890 gas chromatograph. The
instrument was programmed to produce a temperature ramp from 180°C
to 240°C at 2°C/min starting 2 minutes after sample injection. Peaks
corresponding to individual FAMEs were identified by comparison of
retention times to those of authentic standards. Peak areas were calculated
using Hewlett-Packard’s ChemStation software.

Statistics 

Statistical  analyses  were  carried  out  using  version  2.15.3  of  R (R 
Development  Core  Team,  2008;  http://www.r-project.org/). 

Probabilities < 0.05 were considered significant (and labeled *). Percent
data were arcsine transformed (sin⁻¹√proportion) before statistical
analyses as recommended (Sokal and Rohlf 1981). The normality of
distribution of each data set was assessed using the Shapiro-Wilk test. The
homogeneity of variances among data sets was tested using the
Fligner-Killeen test as it has been shown to be least sensitive to departures from
normality  (Conover  et  al.  1981).  The  slopes  of  regression  lines  were 
compared  to  each  other  and  to  slope  = 0 using  the  linear  model  function 
of  R.  Group  means  were  compared  using  one-way  analysis  of  variance 
(ANOVA)  followed  by  Tukey's  HSD  mean  separation  test,  or  where 
appropriate,  the  Kruskal-Wallis  test  followed  by  Wilcoxon  rank  sum  tests. 
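The arcsine transformation applied to percent data can be illustrated as follows. The study's analyses were run in R; this Python sketch only shows the transformation itself (the inverse sine of the square root of the proportion), and the function name and example percentages are illustrative.

```python
import math


def arcsine_sqrt(percent):
    """Arcsine square-root transform of a percentage, returned in radians."""
    proportion = percent / 100.0
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("percent must lie between 0 and 100")
    return math.asin(math.sqrt(proportion))


# Illustrative percentages (for example, percent viable cells).
for percent in (0.0, 50.0, 90.0, 100.0):
    print(f"{percent:5.1f}% -> {arcsine_sqrt(percent):.4f} rad")
```

The transform stretches values near 0% and 100%, which is why it is commonly applied to proportions before analysis of variance.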

Results 

When  cells  of  the  murine  leukemia  line  T27A  were  cultured  in 
media  supplemented  with  DHA,  they  took  up  the  fatty  acid  and 
incorporated  it  into  cellular  phospholipids  (Table  1).  In  phospholipids 
isolated  from  whole  cells,  DHA  levels  were  25  times  that  found  in  control 
cells.  The  increased  proportion  of  DHA  was  associated  with  a large 
reduction  in  the  proportions  of  stearic  acid  (18:0)  and  the  n-6  isomer  of 
18:3.  Proportions  of  palmitic  acid  (16:0)  and  oleic  acid  (18:1)  increased. 
By  contrast,  DHA  incorporation  into  phospholipids  of  the  plasma 
membrane represented an 18-fold increase over that of control cells and
resulted in a final proportion of DHA almost twice that observed in
phospholipids isolated from whole cells. In the plasma membrane, DHA
largely displaced oleic acid (18:1), as well as arachidonic acid (20:4) and
other long chain polyunsaturated fatty acids. The DHA-induced alteration
of membrane lipid composition of both whole cells and plasma membrane
is reflected in the near inversion of the n-6/n-3 ratios (Table 1) after
treatment  with  DHA. 

The  dramatic  accumulation  of  DHA  in  phospholipids  of  the 
plasma  membrane  of  T27A  cells  can  also  be  clearly  seen  when  comparing 
the  ratios  of  palmitate  to  stearate  (16:0/18:0)  and  of  DHA  to  stearate 
(22:6/18:0) in phospholipids extracted from plasma membrane and whole
cells  cultured  in  control  versus  DHA-enriched  media  (Figure  1).  The 
results  shown  in  Table  1 and  Figure  1 closely  mirror  previously  reported 
observations  on  the  effects  of  DHA  on  the  lipid  composition  of  plasma 
membranes  isolated  from  these  cells  (Zerouga  et  al.  1996;  Williams  et  al. 
1998;  Williams  et  al.  1999)  and  indicate  that  the  experiments  presented 
here both complement and expand those earlier works.

Table  1.  The  distribution  of  phospholipid  fatty  acids,  as  percent  of  total 
fatty  acids,  extracted  from  whole  cells  and  from  isolated  plasma 
membranes after 48 h of culture in control or DHA-enriched (0.3 mM)
medium.  Minor  fatty  acids,  i.e.  those  comprising  less  than  1%  of  the  total, 
are  excluded  from  the  analysis.  The  data  represent  the  means  of  two 
independent  experiments. 
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Figure  1.  The  ratios  of  mean  values  of  palmitate  to  stearate  (16:0/18:0)  and  of  DHA  to 
stearate  (22:6/18:0)  in  phospholipids  extracted  from  whole  cells  or  plasma  membrane 
(PM)  after  the  cells  had  been  cultured  for  48  h in  control  medium  or  in  medium 
containing  0.3  mM  DHA. 


Figure  2.  The  density  of  T27A  cells  three  days  after  exposure  to  the  indicated 
concentrations  of  fatty  acid  and  expressed  as  a percentage  of  control.  Squares,  OA; 
circles,  DHA.  Linear  modeling  revealed  that  the  slope  of  the  regression  of  the  OA 
response  is  not  significantly  different  from  zero.  The  slope  of  the  regression  of  the  DHA 
response  is  highly  significantly  different  from  both  slope  = 0 and  the  OA  response  (p  < 
0.001  in  both  cases).  Each  point  represents  the  mean  ± 1 standard  error  of  the  mean  from 
n = 7-14 (DHA) or n = 3-6 (OA) independent determinations of different cultures.

Culture in DHA-enriched medium caused a significant reduction
in the rate of leukemic cell proliferation. Figure 2 shows a DHA dose-
dependent  reduction  in  cell  density  compared  to  control  cultures  and  to 
cultures  similarly  exposed  to  OA.  The  proportion  of  viable  cells  in  the 
DHA-enriched  cultures  also  fell  significantly,  while  the  viability  of  cells 
in  cultures  exposed  to  OA  remained  indistinguishable  from  that  of  the 
controls  (Figure  3).  Together  these  data  show  that  DHA  caused  significant 
cell death over a three-day exposure to concentrations of DHA in the
culture  medium  from  0.3  to  0.9  mM. 
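
As an illustration of how values such as those plotted in Figures 2 and 3 are typically derived, the following Python sketch expresses cell densities as a percentage of control, computes a mean and standard error of the mean, and fits an ordinary least-squares slope of response against dose. The function names and any numbers passed to them are hypothetical and are not taken from this study; the software and regression tests actually used are not described here.

```python
from math import sqrt
from statistics import mean, stdev


def percent_of_control(treated, control_mean):
    """Express each treated-culture density as a percentage of the control mean."""
    return [100.0 * x / control_mean for x in treated]


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for n >= 2 determinations."""
    n = len(values)
    return mean(values), stdev(values) / sqrt(n)


def ols_slope(doses, responses):
    """Ordinary least-squares slope of response on dose."""
    x_bar, y_bar = mean(doses), mean(responses)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(doses, responses))
    sxx = sum((x - x_bar) ** 2 for x in doses)
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical placeholder densities (arbitrary units), not study data.
    control_mean = 200.0
    replicates = [150.0, 160.0, 140.0]
    pct = percent_of_control(replicates, control_mean)
    print(mean_and_sem(pct))
```

A slope near zero, as reported for OA, would indicate no dose dependence; testing whether a slope differs significantly from zero requires a further step (for example a t-test on the slope) that this sketch does not include.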

Phase  contrast  microscopy  revealed  that  unlike  control  cells  or 
cells  cultured  in  OA-enriched  medium,  cells  cultured  in  DHA-enriched 
medium  were  irregularly  shaped  and  exhibited  conspicuously  higher 
internal  complexity  including  extensive  cytoplasmic  vacuolization.  In 
addition,  the  external  surfaces  of  control  cells  and  of  cells  cultured  in  OA- 
enriched  medium  were  even  and  regular,  whereas  the  surfaces  of  cells 
cultured  in  DHA-enriched  medium  were  uneven  and  displayed  numerous 
exvaginations  of  the  plasma  membrane  (commonly  referred  to  as  “blebs”; 
e.g.  Charras  2008).  Figure  4 shows  that  the  percentage  of  T27A  cells 
exhibiting  blebs  increased  steadily  with  DHA  dose  until  at  the  highest 
doses  tested  these  structures  appeared  on  nearly  75%  of  all  cells  present 
in  the  culture. 

Culture  of  T27A  cells  for  16  h in  a medium  containing  0.61  mM 
DHA  resulted  in  a significant  elevation  of  the  activities  of  caspases-3,  -8, 
and -9 (Figure 5). When cell cultures were individually treated with 10
µM of an inhibitor specific for each of these caspases for 30 min prior to
culture in DHA-enriched medium, they did not undergo cell death and cell
densities were similar to those of control cultures (Figure 6).

Figure 3. The viability of T27A cells as assessed by trypan blue exclusion three days
after  exposure  to  the  indicated  concentrations  of  fatty  acid.  The  slope  of  the  regression 
of  the  OA  response  is  not  significantly  different  from  zero  and  that  of  the  regression  of 
the  DHA  response  is  highly  significantly  different  from  both  slope  = 0 and  the  OA 
response  (p  < 0.001  in  both  cases).  Squares,  OA;  circles,  DHA.  Each  point  represents  the 
mean  ± 1 standard  error  of  the  mean  from  n = 3 independent  determinations  of  different 
cultures.

Figure 4. The percentage of T27A cells exhibiting plasma membrane exvaginations
(“blebs”, inset) after 16 h as a function of the concentration of fatty acid in the culture
medium.  Square,  OA;  circles,  DHA.  Each  point  represents  the  mean  ± 1 standard  error 
of the mean from n = 3 independent determinations of different cultures.

Figure 5. The effect of DHA on cellular caspases. The activities of caspases (casp-) 3, 8,
and  9 in  T27A  cells  after  16  h of  culture  in  normal  medium  (control)  and  in  medium 
containing  0.61  mM  DHA.  The  activity  of  caspase-9  is  significantly  (*,  p < 0.05) 
different  from  the  control  value.  The  activities  of  caspases  -3  and  -8  are  not  significantly 
different  from  caspase-9  and  are  marginally  significantly  (0.05  < p < 0.1)  different  from 
the  controls.  The  data  are  presented  as  percent  of  activity  found  in  control  cells  and 
represent  the  means  ± 1 standard  error  of  the  mean  from  n = 3 (caspase-3)  or  n = 4 
independent  assays  using  separate  cell  cultures. 




Figure  6.  Density  of  T27A  cell  cultures  expressed  as  a percentage  of  that  in  control  flasks 
after  48  hours  in  the  presence  of  medium  containing  no  additions,  0.1%  (vol/vol)  DMSO 
(carrier  control),  and  medium  enriched  with  0.61  mM  DHA.  These  are  compared  to  the 
densities of cell cultures exposed for 30 min to 10 µM of an inhibitor specific to each
one of the indicated caspases before the 48 h exposure to medium containing 0.61 mM
DHA. Each bar represents the mean ± 1 standard error of the mean from n = 12 (control,
DMSO, and DHA) or n = 4 (inhibitors) independent cultures and assays. The bar labeled
with the asterisks is significantly (p < 0.05) different from the control value.

Discussion

T27A  is  a line  of  murine  B lymphoblast  cells  with  well-described 
susceptibility  to  DHA-induced  cell  death  (Zerouga  et  al.  1996;  Kafrawy 
et al. 1998). Under normal growth conditions they possess very little DHA
(Table  1).  During  culture  in  media  supplemented  with  DHA,  T27A  cells 
incorporated  considerable  amounts  of  the  fatty  acid  into  phospholipids  of 
their  plasma  membrane  (Table  1,  Figure  1).  Delivering  exogenous 
metabolites and drugs to cells as albumin conjugates is thought to simulate
physiological delivery conditions and to buffer the availability of the
delivered substance. Other advantages of using albumin as a biological
carrier  molecule  are  described  elsewhere  (Kratz  2008;  Elsadek  and  Kratz 
2012). 

In  phospholipids  extracted  from  whole  cells,  DHA  increased  from 
less  than  one-half  of  1%  of  the  total  in  control  cells  to  over  10%  of  total 
phospholipid  fatty  acids  in  cells  cultured  with  supplemental  DHA.  In 
purified  plasma  membrane  preparations  the  percentage  rose  from  close  to 
1%  to  over  18%.  These  results  agree  well  with  previous  data  from  these 
cells  (Williams  et  al.  1998)  and  with  reports  showing  that  DHA  has  a 
powerful  effect  on  both  the  composition  and  structure  of  their  plasma 
membranes  (Zerouga  et  al.  1996;  Zerouga  et  al.  1997;  Williams  et  al. 
1998;  Williams  et  al.  1999).  These  observations  suggest  that  the 
metabolism  of  fatty  acids  in  these  leukemic  cells  favors  the  non-random 
incorporation  of  DHA  into  cell  membranes  with  a preferential 
incorporation  of  DHA  into  phospholipids  of  the  plasma  membrane. 
Preferential  incorporation  of  DHA  into  the  plasma  membranes  of  T27A 
cells  has  been  observed  previously  (Jenski  et  al.  1993;  Pascale  et  al.  1993; 
Williams  et  al.  1998;  Williams  et  al.  1999). 

Culture  of  T27A  cells  in  DHA-enriched  media  caused  a dose- 
dependent  decrease  in  cell  density  and  cell  viability,  an  increase  in  the 
percent  of  cells  exhibiting  blebs,  and  the  activation  of  cellular  caspases. 
OA  did  not  induce  these  effects  (Figures  2 and  3).  We  chose  OA  as  the 
control  fatty  acid  for  this  study  because  it  is  the  most  abundant  fatty  acid 
in many cell types, it is not toxic to T27A cells (Kafrawy et al. 1998), and
because in other cell types and in model membranes it neither induces
apoptosis nor influences membrane raft function or structure (Kishida et
al. 2006; Shaikh et al. 2009; Shaikh et al. 2009a). Figures 2 and 3 show
that when cells were cultured in media containing DHA at concentrations
above  approximately  0.3  mM,  their  rate  of  proliferation  was  slowed  and 
significant  numbers  of  cells  died.  The  observation  of  induction  of  cell 
death  near  0.3  mM  is  consistent  with  what  we  and  others  have  found  in 
this  cell  line  (Zerouga  et  al.  1996;  Williams  et  al.  1998;  Williams  et  al. 
1999),  and  may  have  implications  for  human  health.  In  a study  of  healthy 
men  and  women  the  serum  concentration  of  DHA-phospholipids  was 
found  to  be  near  0.15  mM.  That  concentration  rose  to  over  0.35  mM  after 
six  weeks  of  dietary  supplementation  with  DHA  capsules  (Conquer  and 
Holub  1998).  In  a separate  study  of  234  healthy  men,  the  mean  serum 
concentration  of  DHA-phospholipids  was  0.18  mM  and  was  elevated  to 
over  0.31  mM  by  similar  DHA  capsule  supplementation  (Grimsgaard  et 
al.  1997).  These  studies  show  that  the  DHA  levels  used  to  reduce  the 
growth  and  proliferation  of  mouse  leukemia  cells  in  vitro  can  be  achieved 
in  humans  by  dietary  manipulation. 

At  all  concentrations  of  DHA  examined,  cells  exhibited  distinct 
exvaginations  or  blebs  on  their  plasma  membranes  (Figure  4).  Though  the 
significance  of  these  structures  is  not  well  understood,  they  are  widely 
recognized  as  a hallmark  of  apoptosis  (Charras  2008).  The  definitive 
indicator  of  apoptosis  is  the  presence  of  active  caspases  (Galluzzi  et  al. 
2011)  and  in  these  cells  culture  in  medium  containing  0.61  mM  DHA 
resulted  in  the  activation  of  caspases  -3,  -8,  and  -9  (Figure  5).  These 
observations  establish  that  DHA  induces  apoptosis  in  T27A  cells. 

In  general,  apoptosis  can  be  triggered  by  two  separate,  but  linked, 
pathways:  the  intrinsic  and  extrinsic  pathways  (Portt  et  al.  2011;  Galluzzi 
et  al.  2011).  The  intrinsic  pathway  originates  with  mitochondria  and 
involves  the  release  from  the  intermembrane  space  of  pro-apoptotic 
molecules,  particularly  cytochrome  c.  The  released  cytochrome  c initiates 
a series  of  events  that  result  in  the  conversion  of  inactive  procaspase-9 
into  active  caspase-9.  Caspase-9  then  activates  caspase-3  which  is 
responsible  for  setting  off  the  series  of  down-stream  events  characteristic 
of  apoptosis.  The  extrinsic  pathway  involves  death  receptors  located  in 
the  plasma  membrane  of  the  cell.  Binding  of  an  appropriate  ligand  to  a 
death  receptor  initiates  an  apoptotic  cascade  that  begins  with  the 
conversion  of  inactive  procaspase-8  into  active  caspase-8.  Depending  on 
the type of cell, caspase-8 then activates caspase-3 directly or indirectly
by converting the protein Bid into tBid, which activates caspase-9 (Portt
et  al.  2011;  Galluzzi  et  al.  2011). 

Caspases  -3,  -8,  and  -9  are  all  active  in  T27A  cells  after  exposure 
to  DHA  (Figure  5)  and  the  inhibition  of  any  one  of  them  prevents  the  cells 
from  undergoing  apoptosis  (Figure  6).  Since  caspase-3  is  an  effector 
caspase  acting  downstream  of  the  initiator  caspases  -8  and  -9,  it  appears 
that  a linear  cascade  of  activation  events  occurs  whereby  one  initiator 
caspase  activates  the  other  (i.e.  either  caspase-8  activates  caspase-9  or 
vice  versa)  and  the  latter  then  activates  caspase-3.  In  some  cell  types  both 
caspases -8 and -9 are able to activate caspase-3 directly (Slee et al. 1999;
Peter and Krammer 2003), but apparently in T27A cells under the
conditions used here one of these caspases is unable to do so. It is also
conceivable that one of the initiator caspases (-8 or -9) activates caspase-3, then
caspase-3 activates the remaining initiator caspase (Ozoren and El-Deiry
2003), but this does not appear to be the case here because the activation of
caspase-3 initiates the irreversible stages of apoptosis (the execution pathway) and
thus the inhibition of the initiator caspase that was activated by caspase-3
would not result in the rescue from cell death shown in Figure 6. The
data  presented  here  suggest  that  one  of  the  initiator  caspases  activates  the 
other  yet  is  itself  unable  to  activate  caspase-3.  These  results  are  consistent 
with  T27A  cells  belonging  to  the  type  II  group  of  apoptotic  cells  (Scaffidi 
et  al.  1998;  Ozoren  and  El-Deiry  2002).  In  cells  able  to  undergo  type  I 
apoptosis,  death  receptor/ligand  binding  results  in  the  direct  activation  of 
effector  caspases  like  caspase-3.  Most  cells  undergo  type  II  apoptosis,  in 
which  death  receptor/ligand  binding  is  indirectly  linked  to  the  activation 
of  effector  caspases  through  the  mitochondrion-dependent  pathways  via 
Bid  and  tBid  (Scaffidi  et  al.  1998;  Ozoren  and  El-Deiry  2002;  Blanarova 
et  al.  2011).  These  observations  are  consistent  with  a pathway  in  T27A 
cells  in  which  DHA  induces  apoptosis  by  first  triggering  caspase-8  which 
in  turn  activates  caspase-9  to  initiate  the  effector  caspases. 

Lipid  rafts  are  dynamic  and  ephemeral  laterally  segregated 
assemblies  of  the  plasma  membrane  that  are  rich  in  sphingolipids, 
cholesterol,  and  acylated  and  glycosylphosphatidylinositol  (GPI)- 
anchored  proteins  (Simons  and  Ikonen  1997;  Lingwood  and  Simons 
2010).  Lipid  rafts  serve  as  important  platforms  for  the  regulation  of  cell 
processes by confining and concentrating receptors and enzymes from the
surrounding membrane. They represent another selective cellular
compartment  that  can  co-localize  and  modulate  the  activities  of  these 
proteins  (Simons  and  Ikonen  1997;  Lingwood  and  Simons  2010).  Death 
receptors,  a subset  of  the  tumor  necrosis  factor  receptor  superfamily,  are 
among  those  receptors  that  have  been  shown  to  be  regulated  by  lipid  rafts. 
They include tumor necrosis factor receptor-1 (TNF-R1, p55), death
receptor (DR) 3 (WSL-1/APO-3), DR4 (tumor necrosis factor-related
apoptosis-inducing ligand receptor-1 [TRAIL-R1]), DR5
(TRAIL-R2/APO-2), DR6 and CD95 (Fas/APO-1) (Ashkenazi and Dixit 1998;
Lavrik  2011).  These  receptors  initiate  extrinsic  apoptosis  after  ligand 
binding  or  ligand-independent  clustering  of  receptors  (Fumarola  et  al. 
2001;  Scheel-Toellner  et  al.  2004).  Only  when  located  within  lipid  rafts 
do  death  receptors  facilitate  the  activation  of  caspase-8  and  down-stream 
events  leading  to  apoptosis.  Death  receptors  do  not  activate  caspase-8 
when  located  in  non-raft  regions  of  the  membrane  (Xu  et  al.  2009;  Gajate 
et  al.  2009;  Blanarova  et  al.  2011). 

Lipid  raft  dysfunction  has  previously  been  implicated  in  the  DHA- 
induced  cell  death  of  T27A  cells  (Williams  et  al.  1998;  Williams  et  al. 
1999). Other work has shown that DHA alters the structure (Wassall and
Stillwell  2008),  size  (Chapkin  et  al.  2008;  Rockett  et  al.  2012)  and  protein 
composition  (Rogers  et  al.  2010)  of  lipid  rafts.  Recently,  Shaikh's  group 
has  shown  that  DHA  has  profound  effects  on  mammalian  immune 
function  and  that  these  effects  arise  from  the  influence  of  DHA  on  the 
lipid  rafts  of  B cells  (Rockett  et  al.  2012;  Gurzell  et  al.  2013).  Other 
evidence  convincingly  shows  that  DHA  alters  the  raft-localization  of 
epidermal  growth  factor  receptor  (Schley  et  al.  2007;  Rogers  et  al.  2010), 
caveolin-1  (Li  et  al.  2007),  toll-like  receptors  (Wong  et  al.  2009),  the 
major  histocompatibility  complex  (MHC)  class  I proteins  (Ruth  et  al. 
2009;  Shaikh  et  al.  2009),  the  signaling  molecules  SFK,  Lck,  Fyn,  and  c- 
Yes  (Stulnig  et  al.  1998;  Stulnig  et  al.  2001;  Chen  et  al.  2007),  the 
interleukin-2  receptor  (Li  et  al.  2005),  phospholipase  D1  (Diaz  et  al. 
2002),  endothelial  nitric  oxide  synthase  (Li  et  al.  2007;  Matesanz  et  al. 
2010),  and  protein  kinase  C (Fan  et  al.  2004).  Combined  with  the  data 
presented  here,  these  observations  suggest  that  DHA  has  an  influence  on 
death  receptor-mediated  apoptosis  via  an  action  on  lipid  rafts.  This 
conclusion  is  reinforced  by  studies  showing  that  a number  of  structurally 
diverse anti-tumor agents selectively induce apoptosis in cancer cells by
triggering Fas clustering in lipid rafts (Xu et al. 2009;
Mollinedo  et  al.  2010;  Blanarova  et  al.  2011). 

Conclusions 

When  T27A  leukemic  cells  are  cultured  in  media  enriched  with 
DHA,  the  cells  take  up  the  fatty  acid  and  incorporate  it  into  their 
membranes,  particularly  the  plasma  membrane.  Culture  in  DHA-enriched 
medium also causes cell death by inducing apoptosis. This induction of
apoptosis results from initiation of the extrinsic apoptotic pathway and
proceeds through a linear activation of caspases in the sequence caspase-8,
caspase-9, then caspase-3. Coupled with previous observations by us and others, the
data  suggest  that  the  first  step  in  DHA-induced  cell  death  in  T27A 
leukemia  cells  involves  the  activation  of  plasma  membrane-associated 
death receptors by an influence of DHA on lipid rafts.

Abstract

Leporids (rabbits and hares) were widely sought-after game animals for
many people in the nineteenth century. But how often were offenses against
the game laws caught? That answer depends on the number of wardens,
the number of prosecuted leporid offenses (as compared to other
offenses), and how difficult it was to detect an offense. The aim of this
paper  is  to  determine  the  types  of  offenses  that  game  wardens  enforced, 
the  number  of  prosecuted  leporid  offenses,  the  specific  types  of  leporid 
offenses  in  two  states  — New  Jersey  and  Massachusetts  — and  the  New 
Jersey  counties  where  leporid  enforcements  occurred.  This  investigation 
uncovered three key findings: (1) Greater uniformity in the time that
wardens spent on offenses could be a success factor in catching
leporid offenses; (2) There was a correlation between the number of
wardens  who  cited  leporid  offenses  and  the  number  of  counties  involved 
with  the  leporid  offenses;  and  (3)  Outside  of  those  years,  there  was  a 
disproportionate  number  of  leporid  offenses  when  correlating  for  the 
numbers  of  wardens  and  numbers  of  counties.  Furthermore,  the  results 
of  this  investigation  offer  implications  toward  our  understanding  of  past 
leporid  conservation,  most  notably  findings  related  to  the  uniformity  of 
the  number  of  wardens  prosecuting  leporid  offenses  and  the  years  when 
prosecuted  leporid  offenses  were  prominent  in  the  nineteenth  century. 

Introduction 

The  historical  emphasis  of  managing  wildlife  was  largely  synonymous 
with  managing  game  species  and  predators  (Bolen  and  Robinson,  2003,  pp. 
20, 183). For example, the first game law in North America was enacted in
1639 to close the white-tailed deer hunting season for six months (Bolen
and  Robinson,  2003).  Protection  of  game  is  critical  in  efforts  to  conserve 
wildlife  because  of  the  need  to  understand  biology  at  the  organismal  level 
and  to  conserve  habitats  (Willis  et  al.,  2008). 

However,  the  very  beginnings  of  enforcement  of  wildlife  laws  in  the 
United  States  tell  a rugged  story.  Beginning  with  colonial  times,  game  law 
enforcers intended that private citizens act as “advisors” regarding game
laws to ensure that fellow hunters were following the laws (Lund, 1980),
which would mostly help to inform the privileged class in how to protect or
manage  game  resources  on  their  own  lands.  Yet  if  private  citizens  were 
enforcing  the  game  laws,  how  effective  were  they  at  protecting  wild  game? 
In  historical  analyses  of  wildlife  enforcement  during  the  nineteenth  century, 
Lund  (1980),  Tober  (1981),  and  Stockdale  (1993)  indicated  that 
enforcement of laws pertaining to wildlife was weak or non-existent, since
it was private citizens who conducted the enforcement, a practice that
was criticized by the local townspeople. Early wildlife laws, too, ignored the
bag limit, making enforcement of the laws even more difficult (Lund, 1980).

As enforcement of laws and compliance are important to the
conservation  of  wildlife,  the  historical  nineteenth-century  rationale  or 
prioritization  of  the  protection  of  leporids  (rabbits  and  hares)  in  New  Jersey 
and  Massachusetts  (the  two  states  that  are  the  focus  of  this  study)  appears 
unclear1.  What  offenses  did  the  wardens  focus  their  time  on?  And  which 
counties  were  enforcing  the  leporid  laws?  These  questions  lead  to  further 
questions  as  to  whether  certain  counties  in  New  Jersey  and  Massachusetts 
reported  more  leporid  offenses  than  others,  and  what  factors  were  most 
strongly  associated  with  whether  and  how  game  law  violations  were 
reported. 

Furthermore,  research  regarding  wildlife  conservation  officers  is 
limited  (Archbold,  2012)  so  there  is  a need  for  more  of  a historical 
underpinning  on  the  number  of  conservation  offenses,  the  number  of 
wardens,  and  the  county  distribution  of  the  offenses. 

These  questions  are  important  because  leporids  were  often  hunted 
for food (Omaha Daily Bee, 1887; St. Paul Daily Globe, 1887) at a time
when some North American leporid populations started declining. The
leporids of the two states that are the focus of this study, New Jersey
and Massachusetts, are described below.

The leporids of New Jersey include the Eastern cottontail
(Sylvilagus floridanus), introduced species of the European hare (Lepus
europaeus), the black-tailed jackrabbit (Lepus californicus), the white-tailed
jackrabbit (Lepus townsendii) (State of New Jersey, 2004) and, at one
time, the snowshoe hare (Lepus americanus) (Rhoads, 1903). However,
current  records  indicate  no  presence  of  the  snowshoe  hare  in  New  Jersey 
(Murray and Smith, 2008).

The leporids of Massachusetts also include the Eastern cottontail,
snowshoe hare, and black-tailed jackrabbit (Massachusetts Executive
Office of Energy and Environmental Affairs, 2014). In Massachusetts, the
New England cottontail (Sylvilagus transitionalis) currently has a
conservation  status  of  candidate  species  (USFWS,  2015). 

The purpose of this paper is to determine the types of leporid
offenses; the number of prosecuted leporid offenses; the types of offenses
game wardens enforced; and the counties in which leporid offenses were
prosecuted. It does not attempt to explain why game wardens focused on
specific leporid species offenses.


Methods 

Based  on  the  Annual  Reports  of  the  Board  of  Fish  and  Game 
Commissioners  in  New  Jersey  (1894-1899)  and  the  Reports  of  the 
Commissioners on Inland Fisheries and Game in Massachusetts (1889-1899),
I categorized the nineteenth-century wildlife offenses for those two
states  into  these  eight  groups:  fish,  illegal  fishing,  lobster,  Sunday  offenses, 
pollution,  illegal  game,  trespassing,  and  any  offenses  related  to  game 
generally,  such  as  squirrels,  deer,  ducks,  birds,  and  leporids.  Table  1 
presents  a description  of  each  of  these  eight  categories. 

Each  year  from  1894-1899  for  New  Jersey  and  1889-1899  for 
Massachusetts,  I recorded  the: 

• total  number  of  wardens  who  were  holding  a warden  status  in  New 
Jersey  and  Massachusetts, 

• total  number  of  offenses2, 

• number  of  fish,  illegal  fishing,  Sunday  hunting,  pollution,  illegal 
game,  trespassing,  and  other  game  offenses  (including  the  number 
of  wardens  who  cited  leporid  offenses), 

• number  of  counties  with  leporid  offenses, 

• names  of  the  wardens  who  cited  leporid  offenses,  and 

• counties  in  which  the  offenses  occurred. 

The Simpson’s E or Even-ness index (widely employed for
biodiversity studies) was used to determine the “even-ness” of the
prosecuted leporid offenses and how “evenly” the leporid offenses occurred
in each county (National Center for Ecological Analysis and Synthesis
(NCEAS), 2014). Simpson’s E can range from 0 to 1, with 1 being the
highest  uniformity  (NCEAS,  2014).  Because  there  was  usually  only  one 
warden  spending  time  on  leporid  offenses  in  Massachusetts,  the  even-ness 
index  had  to  include  the  total  number  of  wardens  available,  regardless  of 
whether  all  the  wardens  worked  on  game  such  as  leporids. 
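
The even-ness calculation can be sketched in Python as follows, assuming one common form of Simpson's evenness, E = (1/D)/S with D equal to the sum of squared proportions, where each proportion is the share of leporid offenses attributed to one warden (or county) and S is the number of categories. Whether the NCEAS source defines the index in exactly this form is an assumption; the function name, the optional n_categories argument (used to count wardens with zero leporid offenses, as described for Massachusetts), and the example counts are hypothetical.

```python
def simpson_evenness(counts, n_categories=None):
    """Simpson's evenness E = (1 / D) / S, where D = sum of squared proportions.

    counts: offenses per warden (or per county).
    n_categories: optional total number of categories S, for cases where
        wardens with zero offenses must still be counted.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one offense is required")
    observed = sum(1 for c in counts if c > 0)
    s = observed if n_categories is None else n_categories
    if s < observed:
        raise ValueError("n_categories cannot be smaller than the observed categories")
    dominance = sum((c / total) ** 2 for c in counts)  # Simpson's D
    return (1.0 / dominance) / s


if __name__ == "__main__":
    # Hypothetical placeholder counts, not data from the reports.
    print(simpson_evenness([5, 5, 5, 5]))          # perfectly even
    print(simpson_evenness([9, 1]))                # dominated by one warden
    print(simpson_evenness([4], n_categories=4))   # one active warden out of four
```

Under this form a perfectly even distribution gives E = 1, and the value falls toward 1/S as a single warden or county accounts for more of the offenses.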

Table 1. Offenses pursued in New Jersey and Massachusetts and
characterization of the offenses

Offense                   Description of the Offense
Fish                      Any fishing offense related to possession of a certain fish
                          species caught, or if the fish was under a legal limit size
Illegal fishing           Offenses related to the use of nets or other equipment
                          used to illegally catch fish
Lobster                   Any offense related to lobster catching - e.g., size was
                          too small, mutilation was involved, offense was over the
                          limit
Sunday hunting            Offenses occurring on Sunday (and no hunting or fishing
                          was allowed on Sunday)
Pollution                 Any pollution that occurred
Illegal game              Any attempt to take game or the act of possessing illegal
                          game
Trespassing               Offenses where people were caught trespassing on
                          property not belonging to them
Squirrels, deer, ducks,   Offenses where any of these animals were caught out of
song birds, or leporids   season or over the legal limit; any attempt to kill these
                          animals; the use of certain illegal methods to catch these
                          animals (i.e., snares)

Regarding the county data: In New Jersey only the years 1894-1895,
1898, and 1899 contained county data. Massachusetts county data were not
analyzed because county data were sparse, and due to the nature of the
historical  documents,  it  was  difficult  to  decipher  where  the  offenses 
originated.  For  the  New  Jersey  data,  correlation  analysis  was  used  to 
determine  the  relationship  between  the  number  of  wardens  who  cited 
leporid  offenses  and  the  number  of  counties  involved  with  leporid  offenses 
in  the  state  of  New  Jersey. 
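
A minimal sketch of such a correlation analysis, assuming the Pearson product-moment coefficient was the statistic used (the text does not name it). The function name and the yearly values in the usage example are hypothetical placeholders and are not the counts taken from the New Jersey reports.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical placeholder values: wardens citing leporid offenses and
    # counties with leporid offenses, one pair per reporting period.
    wardens = [6, 13, 4]
    counties = [8, 12, 5]
    print(round(pearson_r(wardens, counties), 3))
```

With only a few reporting periods available, a coefficient computed this way rests on very few data points, so it describes the association in the sample rather than establishing a general relationship.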

Results

Findings  on  New  Jersey 

In New Jersey, fishing and illegal fishing activities were the most
commonly reported offenses in 1894-1895 (49% of the overall time). In
1896, Sunday hunting and offenses with song birds were the most
commonly reported offenses (63% of the overall time). In 1897, illegal
fishing activities and offenses with song birds were the most commonly
reported offenses in the state (60% of the time). By 1898 and 1899, Sunday
hunting and offenses with song birds were the most commonly reported (57%
of the time for both years).

The  above  overall  prosecuted  offenses  provide  the  metrics  by  which 
annual  percentages  of  prosecuted  leporid  offenses  are  calculated  for  the 
state  of  New  Jersey  during  those  same  years  (Figure  1). 




Figure  1.  New  Jersey:  Percentage  of  leporid  offenses,  among  other  types  of  offenses. 

In  New  Jersey,  the  number  of  wardens  each  year  did  not 
uniformly  reflect  the  number  of  leporid  offenses.  The  highest  number  of 
wardens  involved  with  leporid  offenses  occurred  in  1896  with  14  wardens. 
At  the  same  time,  1896  had  the  lowest  percentage  of  wardens  reporting 
leporid  offenses  (Figure  1).  Simpson  E values  related  to  the  even-ness  of 
wardens prosecuting leporid offenses in each year are reported in Table 2.

Table 2. New Jersey: Even-ness of wardens prosecuting leporid offenses.

Year        1894-1895   1896   1897   1898   1899
Even-ness   .7475       .78    .394   .809   .685

In  New  Jersey,  the  number  of  counties  in  which  leporid  offenses 
were  prosecuted  each  year  did  not  uniformly  reflect  the  total  leporid 
offenses  in  the  state.  The  most  leporid  offenses  (relative  to  other  offenses) 
occurred in 1898 and 1899 (Figure 1); however, the fewest counties
that cited leporid offenses occurred in 1899 (Figure 2). The year 1898 had
the most even number of wardens spending time on leporid offenses (Table
2) and 1898 is also significant for the even-ness of the counties that cited
leporid offenses (see Table 3, which shows Simpson E values for the
even-ness of counties with prosecuted leporid offenses in 1894-1895, 1898, and
1899). The year 1899 was the least uniform in terms of counties with leporid
offenses  (Table  3),  with  Bergen  County  carrying  over  half  of  the  total 
leporid  offenses  (Figure  2). 

The  number  of  wardens  who  prosecuted  leporid  offenses 
correlates  strongly  to  the  number  of  counties  in  which  offenses  were 
prosecuted in New Jersey. Correlation analysis showed a strong association
(r = 0.989) between the number of wardens who cited leporid
offenses and the number of counties involved with the leporid offenses. In
1896 and 1898, the number of wardens who worked on leporid offenses was
more  than  50%  of  all  wardens  (Figure  3);  however,  the  most  counties 
prosecuting  leporid  offenses  occurred  in  1898  with  twelve  counties 
involved  (Figure  4).  In  the  years  1894-1895,  eight  counties  were  involved 
in  leporid  offenses  (Figure  5). 
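
For readers who want to see how a correlation coefficient of this kind is obtained, the following is a minimal sketch of the Pearson product-moment calculation. The yearly counts are hypothetical placeholders, not the study's data, and the function name is an assumption made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical yearly values (NOT taken from the warden reports).
wardens_per_year = [3, 5, 8, 10, 12]
counties_per_year = [2, 4, 5, 8, 9]

print(round(pearson_r(wardens_per_year, counties_per_year), 3))
```

With only a handful of yearly observations, as here, a high r should be read cautiously, since a few points can produce a strong coefficient by chance.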


Washington  Academy  of  Sciences 




Figure 2. New Jersey counties prosecuting leporid offenses, 1899.


Table 3. New Jersey: Even-ness of counties prosecuting leporid offenses.

Year        1894-1895   1896   1897   1898   1899
Even-ness   .7475       NA     NA     .842   .488

Figure 3. New Jersey: Number of wardens who prosecuted leporid
offenses, expressed as a percentage of the total number of wardens.

The most prominent year for prosecuting leporid offenses in New
Jersey was 1898, based upon the previously discussed percentage of
wardens and percentage of prosecuted leporid offenses in New Jersey
(Figures 3 and 1), and also based upon the number of wardens prosecuting
leporid offenses being the most uniform (Table 2).

The  most  un-even  number  of  wardens  prosecuting  leporid  offenses 
occurred  in  1897,  given  that,  for  example,  Warden  Dunham  carried  44%  of 
the  total  leporid  offenses.  Also,  in  1897,  there  was  a decrease  (from  the 
previous  year)  in  the  number  of  wardens  contributing  to  leporid  offense 
prosecutions  (Figure  3). 


Figure 4. New Jersey counties prosecuting leporid offenses, 1898.
[Chart axis label: County.]

New  Jersey  had  various  types  of  prosecuted  leporid  offenses,  with 
the  majority  of  the  offenses  involving  killed  rabbits  and  the  lowest  number 
of offenses for selling rabbits (Figure 6). The methods used by hunters in
the killed-rabbit offenses were not mentioned.

Figure 5. New Jersey counties prosecuting leporid offenses, 1894-1895.

Figure 6. New Jersey: Types of leporid offenses (killed, possession of,
snared, snooded, use of ferret, trapped, or sale of leporids).

Findings on Massachusetts

In Massachusetts, a greater number of wardens coincided with more
prosecuted leporid offenses. The largest number of wardens dealing with
leporid offenses in Massachusetts was two; this occurred in 1894. The year
1894, consequently, had the most prosecuted leporid offenses of any year from
1889 to 1899 (Figures 7 and 8).


Figure 7. Massachusetts: Number of wardens who prosecuted for leporid
offenses, expressed as a percentage of the total number of wardens.

Figure 8. Massachusetts: Percentage of prosecuted leporid offenses,
relative to other types of offenses.

The year 1897 had the least uniform percentage of leporid offenses
in Massachusetts. As indicated in Table 4, for Massachusetts, 1897 had the
lowest percentage of wardens dealing with leporid offenses along with the
low percentage of prosecuted leporid offenses previously shown in Figures
7 and 8.

Table 4. Massachusetts: Even-ness of prosecuted leporid offenses by year.

Year        1894   1895   1896   1897   1899
Even-ness   .138   .077   .066   .032   .071

In Massachusetts, more uniform leporid offense prosecutions went together
with a higher number of wardens and a higher percentage of prosecuted
leporid offenses. Overall, Massachusetts was not very uniform in its
prosecuted leporid offenses: its Simpson's E values were far from 1
(Table 4). During the years 1895-1899 (with the exception of
1898, which had no leporid offenses), only one warden prosecuted
leporid offenses in Massachusetts. The year 1894, by contrast, had the most
uniform prosecuted leporid offenses (Table 4), and this coincided with the
highest percentage of wardens dealing with leporid offenses and the highest
percentage of leporid offenses.

As  previously  noted,  the  year  1897  was  the  least  uniform  with 
prosecuted  leporid  offenses  in  Massachusetts  (Table  4).  The  year  1897  also 
reflected  the  lowest  percentage  of  the  number  of  wardens  dealing  with 
leporid  offenses  in  Massachusetts  and  the  lowest  percentage  of  prosecuted 
leporid  offenses. 


Analysis  and  Implications 

For both New Jersey and Massachusetts, 1897 was the least uniform
in  terms  of  prosecuted  leporid  offenses  (Tables  2 and  4).  For  New  Jersey, 
this  did  not  mean  that  the  lowest  uniformity  in  1897  reflected  the  lowest 
percentage  of  wardens  or  prosecuted  leporid  offenses.  For  Massachusetts  it 
did  — and  when  wardens  had  more  uniform  annual  prosecutions  for  leporid 
offenses,  there  was  also  the  highest  percentage  of  wardens  and  highest 
percentage of prosecuted leporid offenses.

On the one hand, the year 1894 was the most prominent year for
catching leporid offenses in Massachusetts; however, Massachusetts'
“highest” point was New Jersey’s “lowest” point (Figures 3 and 7). On the
other hand, Massachusetts had a higher percentage of prosecuted leporid
offenses than New Jersey did in 1894, and the following years were
equivalent (Figures 1 and 8). This may be because the total numbers of
offenses were higher in New Jersey than in Massachusetts. Massachusetts
prosecutions centered on ferret offenses (a total of 21 offenses), with
three further offenses involving a snare (Figure 9).

Figure 9. Massachusetts: Types of leporid offenses (killed, possession of, snared,
snooded, use of ferret, trapped, or sale of leporids).

Conclusions  and  Additional  Implications 

The  results  from  this  research  demonstrated  a strong  correlation 
(r  = 0.98)  between  the  number  of  wardens  that  cite  leporid  offenses  and  the 
number  of  counties  involved  with  the  leporid  offenses  in  New  Jersey. 
However,  there  could  be  several  variables  that  impinge  on  the  relationship 
between  the  number  of  wardens  who  cited  leporid  offenses  and  the  number 
of  counties  involved  with  the  leporid  offenses.  For  example,  it  is  unknown 
how  often  the  wardens  traveled  between  counties  or  if  the  warden  remained 
in  the  county  in  which  he  lived.  In  fact,  Warden  Post  of  Somerset  County 
wrote  that  he  conducted  a majority  of  his  inspections  near  where  he  lived  in 
Somerset County (Annual Report of the Board of Fish and Game
Commissioners of the State of New Jersey, 1894). Likewise, some of the
counties  had  more  than  one  warden  enforcing  laws  and  in  some  cases,  there 
were  two  wardens.  Furthermore,  it  was  difficult  to  discern  if  the  wardens 
actively  looked  for  leporid  offenses  or  if  they  happened  to  come  across 
offenses. 

It would be useful to understand why, for New Jersey in 1899, there
was a higher percentage of prosecuted leporid offenses yet the fewest
counties involved. It was also the most disproportionate year in terms of
the number of leporid offenses occurring in the counties, with Bergen
County leading the way with over 50% of the leporid offenses in 1899. This
result  may  be  because  it  was  the  second  lowest  uniformity  for  prosecuted 
leporid  offenses.3 

Further  research  could  try  to  determine  why  some  years  had  more 
offenses  related  to  fishing,  and  then  considerably  fewer  fishing  offenses  in 
1898 and 1899. This investigation revealed that hunting on Sunday and song
bird  offenses  were  relatively  common  prosecutions  for  every  year  studied. 

At this point, it seems relevant to comment on why more prosecutions
may have focused on Sunday hunting and song bird offenses. Lund (1980)
and Lueck (1995) provided suggestions in terms of facilitation and
economics as to why some offenses were enforced more than others. Lund
(1980) described various methods early legislators used to make
enforcement easier (see Lund, 1980, for more details). For example, Lund
(1980) described closed seasons as easy to enforce (i.e., a closed
season could be equivalent to no hunting on Sunday). Lueck (1995)
took a more economic approach and suggested that by not having
hunting on Sunday, land could be used for other productive uses and the
prohibition of hunting on Sunday was a chance to reduce contracting costs.

Also, Lund (1980) proposed that it was easier to make a prosecution
when  game  was  being  sold,  rather  than  witnessing  the  offense  in  the 
countryside.  It  is  surprising  that  prosecutions  for  selling  leporid  meat  were 
not  more  prevalent  (i.e.,  this  study  found  that  it  was  the  killed  leporid  or  use 
of  a ferret  that  were  more  dominant  offenses).  It  would  be  insightful  to  note 
the  kinds  of  factors  that  would  make  leporid  offenses,  specifically,  easier  to 
discover. 

Furthermore,  the  years  1894-1899  seemed  to  have  consistently 
fewer pollution, illegal game, and trespassing offenses. There may have
been a lower number of offenses related to illegal game because it was a
somewhat  vague  offense  — citing  only  “game”  and  not  a specific  species.4 
However,  the  lower  number  of  prosecuted  trespassing  offenses  may  be 
because trespassing was an offense that could be handled more
self-sufficiently with a landowner (i.e., not needing a warden) since there was
the  presumption  that  — unless  owners  posted  notices  saying  hunting  was 
not  allowed  on  the  land  — people  could  continue  to  hunt  (Lund,  1976). 

One reason Massachusetts may not have had any leporid offenses in
1898 is that the report for that year described an abundance of leporids (Report
of the Commissioners on Inland Fisheries and Game, 1898), so perhaps the
wardens felt that leporid laws did not need to be as heavily enforced
that year. For example, Wardens Smith and Manly reported observing the
illegal  use  of  ferrets  during  hunting,  but  also  reported  that  they  did  not  catch 
anyone  in  the  act  of  crime  (Report  of  the  Commissioners  on  Inland  Fisheries 
and  Game,  1898). 

Finally,  as  noted,  this  investigation  revealed  that  the  number  of 
wardens  did  not  always  reflect  more  prosecuted  leporid  offenses  in  New 
Jersey  although  it  did  for  Massachusetts.  Further  research  could  focus  on 
why  more  wardens  did  not  lead  to  more  leporid  prosecutions  in  New  Jersey. 

It  should  also  be  pointed  out  that  future  research  efforts  will  need  to 
take  into  consideration  any  new  findings  regarding  leporid  population 
cycles.  That  is,  some  leporid  populations  are  believed  to  cycle  in  10-year 
intervals  which  could  affect  the  types  of  variables  being  measured  here.  For 
example,  the  snowshoe  hare  species  appears  to  exhibit  10-year  cycles, 
although  other  leporids  have  not  been  researched  as  extensively  in  this 
regard. The possible display of a 10-year population flux by other leporid
species, however, is a consideration for future related research.

Notes

1 New Jersey and Massachusetts were the states highlighted in this study because the data
were accessible.

2 In the reports, there was no distinction between offenses and prosecutions. Even when a
case was dismissed, it was still classified as an offense.

3 The study did not control for Bergen County statistically and then recalculate the
Simpson’s E because the goal was to examine the outstanding counties, overall, not to
do a county by county comparison.

4 In terms of whether the referenced reports included or did not include leporids,
unfortunately, in many cases, historical articles are the only resources available and,
in many instances, the label “game” may have meant deer, ducks, etc., which were
inexplicably lumped into one category.


Uranus and Neptune Revisited

Sethanne  Howard 
USNO,  retired 

Abstract 

Uranus  and  Neptune  are  two  planets  not  known  in  ancient  times.  Once 
discovered,  however,  astronomers  were  eager  to  obtain  their  vital 
statistics.  Of  course  once  Voyager  flew  by  them  everyone  had  the 
information,  but  before  there  was  Voyager,  there  were  many  attempts 
to  measure  things  like  the  rotational  period,  the  mass,  the  brightness, 
etc.  This  is  the  story  of  the  two  people  who  obtained  the  rotational 
period  of  both  planets  before  Voyager  got  there.  They  confirmed  that 
the  spin  of  Uranus  is  retrograde  and  that  of  Neptune  direct.  Uranus 
rotates  on  its  side.  Their  estimates  for  the  periods  of  rotation  are,  for 
Uranus,  24  ±3  hr.,  and  for  Neptune,  15  ±3  hr. 

Introduction 

For  most  of  human  history  humanity  knew  of  only  five  planets  (plus 
our  own):  Mercury,  Venus,  Mars,  Jupiter,  and  Saturn.  These  are  the 
planets  that  are  visible  to  the  naked  eye  at  various  times  during  the  year. 
We  learned  relatively  recently  that  there  are  two  other  planets  in  our  Solar 
System:  Uranus  and  Neptune.  Note  the  proper  pronunciation  of  Uranus 
(accent  on  the  first  syllable). 

First, a little history on Uranus and Neptune. The question is: what
did we know before the Voyager flyby of Uranus?

Uranus  had  been  observed  on  many  occasions  before  its 
recognition  as  a planet,  but  it  was  generally  mistaken  for  a star.  Then  Sir 
William  Herschel  observed  the  planet  qua  planet  on  March  13,  1781.  This 
was  the  first  planet  added  to  the  Solar  System  since  the  dawn  of  history. 
See  Figure  1 for  a drawing  of  the  telescope  he  used.  He  decided  to  name 
the  new  planet  Georgium  Sidus  (George’s  Star),  in  honor  of  his  new 
patron,  King  George  III  of  England.  As  one  might  expect,  this  was  not 
popular outside England. Bode (a German astronomer) opted for Uranus,
the  Latinized  version  of  the  Greek  god  of  the  sky,  Ouranos  (the  only 
planet  with  a name  of  Greek  origin).  Bode  argued  that  just  as  Saturn  was 
the  father  of  Jupiter,  the  new  planet  should  be  named  after  the  father  of 
Saturn.  Ultimately,  Bode’s  suggestion  became  the  most  widely  used,  and 
became  universal  in  1850  when  Her  Majesty’s  Nautical  Almanac  Office, 
the final holdout, switched from using Georgium Sidus to Uranus.

Figure 1 — Discovery telescope for Uranus

Uranus  is  the  seventh  planet  from  the  Sun.  It  has  the  third-largest 
radius  and  fourth-largest  mass  in  the  Solar  System.  It  revolves  around  the 
Sun  once  every  84  Earth  years.  Uranus  has  a ring  system  and  numerous 
moons.  The  Uranian  system  has  a unique  configuration  among  the  planets 
because  its  axis  of  rotation  is  tilted  sideways,  nearly  into  the  plane  of  its 
revolution  about  the  Sun.  Its  north  and  south  poles  therefore  lie  where 
most  other  planets  have  their  equators. 

Uranus’s orbital elements (the shape of its orbit) were first
calculated in 1783 by Pierre-Simon Laplace.[i] Over time, discrepancies
began to appear between the predicted and observed orbits, and in 1841,
John C. Adams[ii] first proposed that the differences might be due to the
gravitational tug of an unseen planet.[iii] In 1845, Urbain Le Verrier[iv] began
his own independent research into Uranus’s orbit. On September 23,
1846, Johann G. Galle[v] was the first to see the new planet close to the
position predicted by Le Verrier.[vi]

There  was  considerable  controversy  over  the  name  for  the  new 
planet.  At  first,  Neptune  was  simply  called  “the  planet  exterior  to  Uranus” 
or “Le Verrier’s planet.” However, eventually the name Neptune, Roman
god  of  the  sea,  was  accepted. 

Neptune  is  the  eighth  and  farthest  planet  from  the  Sun  in  the  Solar 
System. It is the fourth-largest planet by diameter and the third-largest by
mass. It revolves around the Sun once every 164.8 Earth years. Like
Uranus,  Neptune  has  a ring  system  and  several  moons. 

Uranus  - the  Details 

Uranus  was  known  to  be  a bit  strange.  It  was  already  suspected 
that  its  axis  of  rotation  was  off  kilter.  Most  of  the  Solar  System  planets 
have  rotation  axes  that  are  close  to  perpendicular  to  the  plane  of  the  orbit. 
The  Earth,  for  example,  has  an  axis  tilted  only  23.5°  off  perpendicular. 
Uranus,  on  the  other  hand,  was  thought  to  have  a rotation  axis  almost  in 
its  orbital  plane.  No  one  was  quite  sure  of  this,  though. 

Its  major  moons  are  Ariel,  Umbriel,  Titania,  Oberon,  and  Miranda 
- names  taken  from  Shakespeare.  There  are  about  27  moons  known. 

The  internal  structure  of  Uranus  is  shown  in  Figure  2.  The  wildly 
off-center  magnetic  field  is  shown  in  Figure  3. 


Figure  2 — Structure  of  Uranus 

The  planet  is  thought  to  have  a very  small  central  almost  rocky 
core,  surrounded  by  a plasma  ocean,  surrounded  in  turn  by  an  atmosphere 
with lots of hydrogen, helium, and methane (CH4). It is the methane that
makes Uranus appear cyan in color. Note in Figure 3 that the Magnetic
North  Pole  points  “downward”  - below  the  orbital  plane. 

The  magnetic  field  is  not  centered  on  or  near  the  center  of  the 
planet.  This  is  quite  unusual.  This  unusual  geometry  results  in  a highly 
asymmetric  magnetosphere.  By  comparison,  the  magnetic  field  of  Earth 
is  roughly  the  same  at  either  pole,  and  its  “magnetic  equator”  is  roughly 
parallel  with  its  geographical  equator. 


Figure  3 — magnetic  field  for  Uranus  (left)  and  Neptune  (right) 

Neptune  - the  Details 

Figure  4 shows  the  structure  of  Neptune.  Although  smaller  than 
Uranus  as  seen  from  the  Earth,  when  seen  with  a large  telescope  it  is 
visible  as  a disk. 

Voyager  found  the  axial  tilt  of  Neptune  to  be  28.32°  - similar  to 
the  Earth's  tilt.  Triton  is  its  major  moon  - very  large  as  moons  go.  Unlike 
other  large  planetary  moons  in  the  Solar  System,  Triton  has  a retrograde 
orbit,  indicating  that  it  was  captured  rather  than  formed  in  place.  There 
are  about  14  known  moons  for  Neptune.  The  magnetic  field  of  Neptune 
(Figure  3)  is  also  a bit  off  center  although  not  as  much  as  Uranus. 

Never  visible  to  the  naked  eye,  Neptune  requires  a 4 meter  class 
telescope  to  capture  its  spectra,  and  a 50  inch  telescope  to  work  in  the  near 
infrared  part  of  the  spectrum.  To  me  it  is  a beautiful  planet  because  its 
color  is  a deep,  rich  blue.  The  atmosphere  is  mainly  hydrogen  and  helium 
with trace amounts of CH4 that contribute to that beautiful color.

Planetary Rotation


All  planets  rotate.  The  Earth  rotates  about  its  axis  once  a day.  This 
is how we define a ‘day’. Planets revolve around the Sun and rotate about
their  individual  axes.  It  is  fairly  straightforward  to  obtain  the  rotation 
periods  (i.e.,  the  length  of  the  planetary  day)  of  the  five  naked  eye  planets 
- one  simply  watches  them.  We  can’t  watch  the  more  distant  Uranus  and 
Neptune.  It  takes  a large  telescope  to  determine  their  rotational  periods. 


Figure  4 — The  internal  structure  of  Neptune: 

1. Upper atmosphere, top clouds
2.  Atmosphere  consisting  of  hydrogen,  helium,  and  CH4  gas 
3.  Mantle  consisting  of  water,  ammonia,  and  CH4  ices 
4.  Core  consisting  of  rock  (silicates  and  nickel-iron) 

By  using  various  techniques  people  tried  to  determine  the  rotation 
periods  for  Uranus  and  Neptune.  A visual  technique  means  watching  the 
planet  as  it  spins.  This  is  rather  like  watching  the  Great  Red  Spot  on 
Jupiter  as  Jupiter  rotates:  once  around  is  a ‘day’  on  Jupiter.  Theory  means 
that  the  rotation  period  is  derived  from  planetary  theory  (using  the  mass 
and  shape  of  the  planet  to  derive  its  period),  not  by  using  a telescope. 
Photometry means that a telescope is used with a filter in selected
wavelength bands (e.g., a color like infrared) to measure changes in the
light from the planet. A regular and repeatable change in the light can
represent the length of the planetary day. The spectra technique means
that the planet's spectral lines are used to determine the period. This last
is  the  most  difficult  to  do  because  it  means  measuring  the  minute  tilt  of 
the  spectral  lines,  and  from  that  tilt,  the  rotational  period. 

Some  of  the  early  attempts  are  listed  in  Table  I and  Table  II. 

Table I - early values for Uranus

Date   Period            Technique    Person
1872   12h               Visual       Buffam
1900   7h < P < 12h34m   Theory       Houzeau
1902   retrograde        Spectra      Deslandres
1912   10h45m            Spectra      Lowell, Slipher
1916   10h49m            Photometry   Campbell
1930   10h50m            Spectra      Moore, Menzel

It  appears  that  people  were  closing  in  on  a rotational  period 
between 10h and 11h, and this value showed up in textbooks of the time.
Actually the notion that Uranus has a rapid rotation goes back to Herschel
who thought he saw a polar flattening of the planet.[vii] Somewhat later
Laplace provided further qualitative support for Herschel's deduction by
noting  that  the  observed  co-planar  nature  of  the  satellite  orbits  implied 
that  Uranus  needed  a substantial  equatorial  bulge  to  counteract  the 
disruptive  perturbations  of  the  Sun.  However,  the  moon  Miranda  (smallest 
and  innermost)  is  on  an  orbit  substantially  inclined  to  the  common  plane 
of  the  remaining  satellites.  Curious.  The  first  substantial  datum  on 
Uranus’s  rotation  was  provided  by  Deslandres  in  1902,  who  detected  the 
tilt,  induced  by  rotation,  of  reflected  Fraunhofer  lines  in  the  planet's 
spectrum,  thus  proving  the  retrograde  sense  of  the  planet’s  spin. 

Table II - early values for Neptune

Date   Period           Technique    Person
1884   7.92h            Photometry   Hall
1896   13h < P < 18h    Theory       Tisserand
1928   15.8h            Spectra      Moore, Menzel
1955   12.43h           Photometry   Gunther

There  was  considerable  scatter  in  the  proposed  period  for  Neptune. 
It  is  farther  from  Earth  than  Uranus,  so  it  is  more  difficult  to  observe.  Even 
the  sense  of  its  spin  was  uncertain  (prograde  or  retrograde).  It  was  often 
supposed that the planet probably rotated in a retrograde sense. Finally,
since Moore and Menzel’s work on Neptune was unique, a re-examination
of  the  sense  of  spin  was  worthwhile.  The  sense  of  the  spin  is  a crucial 
factor  in  understanding  the  evolution  of  Triton’s  retrograde  orbit. 

In the mid-1970s Mike Belton and Sethanne Howard (Hayes),
who both worked at Kitt Peak National Observatory (in Tucson, Arizona),
decided to re-measure the rotational periods of Uranus and Neptune. The
Voyager mission was due to encounter Uranus in 1986, so they had to get
their data before Voyager got there if their work was to help prepare the
Voyager mission for the Uranus encounter. They wanted to determine the
rotation  period,  sense  of  spin,  and  orientation  of  the  spin  axis.  They 
decided  to  use  the  spectra  method  for  Uranus,  and  spectra  and  photometry 
methods  for  Neptune. 

This  is  their  story  of  how  this  was  done  in  the  days  before  the 
Internet,  thumb  drives,  and  laptops.  The  details  of  the  math  are  omitted 
for  simplicity. 

For  the  spectral  work  they  chose  reflected  Fraunhofer  lines.  These 
spectral  lines  come  from  sunlight  reflected  by  the  planet’s  atmosphere.  Of 
course,  planets  do  not  shine  on  their  own.  They  are  seen  by  reflected 
sunlight.  The  visible  and  near-infrared  spectra  of  Uranus  and  Neptune 
have  strong  Fraunhofer  absorption  lines  making  them  good  candidates  for 
the  tilted  spectral  line  approach. 

No  one  had  ever  observed  any  regular  variation  in  the  light  of 
Uranus  (Campbell’s  work  was  questioned)  so  they  did  not  use  the 
photometry  method  for  Uranus.  Actually  Howard  had  done  a small  project 
in  the  mid-1970s  where  several  images  of  Uranus  were  co-added  together 
to  increase  the  signal-to-noise.  The  result  was  a fairly  featureless  planet 
with an increase in contrast in one hemisphere. Not believing the results,
she dropped the project. That was perhaps unfortunate because Voyager
later  showed  the  same  feature. 

For  the  photometric  work,  they  already  knew  that  Neptune  showed 
variations  in  infrared  light  so  they  chose  that  spectral  waveband  for  the 
observations. 


Gathering  the  Observations 

Belton  received  observing  time  on  the  Kitt  Peak  4 meter  Mayall 
telescope  for  this  project.  It  was  unusual  to  get  4 meter  time  for  planetary 
work. He used Kodak IIIa-J plates to record the data.[viii] Before there were
digital data, there were glass plates with a photographic emulsion
embedded on them. The plate was baked in N2 for several hours to
increase its sensitivity. After cutting the plate to the proper size (in the
dark), one exposed the plate to the object of interest. The developed plate
looked like a negative, lighter where the spectral lines appeared (Figure
7).

Astronomical  spectra  typically  look  like  a series  of  lines,  some 
wide  some  narrow.  In  this  case,  each  strip  of  a line  represents  a chemical 
element  or  molecule  in  the  atmosphere  of  the  planet.  Figure  5 shows  a 
standard  absorption  spectrum  spanning  blue  to  red.  The  lines  are  not  tilted. 
Figure 6 shows a planetary spectrum with a tilted line[ix].

[Figure 5 axis: wavelength in nanometers (10^-9 m), labeled at 400, 500, 600, and 700.]


Figure  5 — standard  absorption  spectrum 


Figure  6 — tilted  lines.  The  top  is  coming  towards  the  observer  (blue  shift),  the 
bottom  is  going  away  from  the  observer  (red  shift).  The  laboratory  line  (no  tilt)  is 
shown  in  white  in  front  of  the  tilted  line. 

Belton  oriented  the  slit  of  the  spectrograph  so  that  it  spanned  the 
planet  from  one  side  to  the  other.  He  took  a timed  exposure.  Then  he 
would  rotate  the  spectrograph  slit  by  a few  degrees  and  take  another 
spectrum.  He  was  finished  when  he  had  rotated  the  slit  all  the  way  around 
the  planet.  He  obtained  a set  of  nice  spectral  data  from  Uranus  and 
Neptune.  Some  spectra  are  shown  in  Figure  7 which  shows  three  position 
angles  (angle  of  the  slit  on  the  planet)  of  Uranus  and  Neptune  and  the 
lunar spectrum used to set the plate scale,[x] dispersion,[xi] and intrinsic line
tilt.  This  is  a developed  plate  (a  negative  of  the  original)  so  the  lines  are 
not  dark,  they  are  light  in  hue. 

Why  do  the  spectral  lines  tilt?  Uranus  is  rotating  about  its  axis.  At 
any given time one side comes toward us, the other away from us. This
results in a classical Doppler shift of the light we see. Place the
spectrograph  slit  entirely  across  the  planet.  Then  one  end  of  the  slit  has 
light  receding  from  the  observer  (red  shift).  The  other  end  of  the  slit  has 
light  coming  towards  the  observer  (blue  shift).  A red  shift  will  shift  the 
position  of  the  spectral  line  to  the  right  just  a bit.  A blue  shift  will  shift  the 
position  of  the  spectral  line  to  the  left  just  a bit.  The  amount  of  shift  varies 
with  the  location  of  the  slit  on  the  planet.  Near  the  planet  center  there  will 
be  no  shift  at  all.  The  farther  from  the  center,  the  greater  the  shift,  hence 
a tilt  to  the  whole  line  as  it  covers  the  planet. 
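
The geometry just described can be sketched numerically. The following is a minimal illustration and not the reduction procedure Belton and Howard used: it assumes solid-body rotation, a slit laid along the planet's equator, a spin axis perpendicular to the line of sight, and a factor of 2 for sunlight reflected near opposition (the Sun-to-planet and planet-to-observer legs each contribute a shift). The radius value, the reflection factor, and all function names are assumptions introduced for the example.

```python
import math

C_KM_S = 299_792.458         # speed of light in km/s
URANUS_RADIUS_KM = 25_559.0  # assumed equatorial radius; not given in the article


def equatorial_speed_km_s(radius_km, period_hr):
    """Speed of a point on the equator of a solid-body rotator."""
    return 2.0 * math.pi * radius_km / (period_hr * 3600.0)


def line_shift(rest_wavelength, x_fraction, v_eq_km_s, reflection_factor=2.0):
    """Doppler shift (same units as rest_wavelength) at a slit position.

    x_fraction runs from -1 (approaching limb) through 0 (disk center)
    to +1 (receding limb); the line-of-sight speed grows linearly with it,
    which is what tilts the spectral line.
    """
    line_of_sight_speed = v_eq_km_s * x_fraction
    return rest_wavelength * reflection_factor * line_of_sight_speed / C_KM_S


def period_from_limb_shift(rest_wavelength, limb_shift, radius_km,
                           reflection_factor=2.0):
    """Invert the relation: rotation period (hours) from the shift at the limb."""
    v_eq = limb_shift * C_KM_S / (rest_wavelength * reflection_factor)
    return 2.0 * math.pi * radius_km / v_eq / 3600.0


# Illustrative case: a 24-hour period and a line near 5000 angstroms.
v_eq = equatorial_speed_km_s(URANUS_RADIUS_KM, 24.0)
for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(f"x = {x:+.1f}  shift = {line_shift(5000.0, x, v_eq):+.4f} angstrom")
```

Under these assumptions the shift at the limb is only a few hundredths of an angstrom, which gives a sense of why the text calls the tilt "minute" and why a large telescope and careful plate measurement were needed.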




Figure  7 — spectra  of  Uranus,  Neptune,  and  the  Moon 

Note  that  the  widths  of  the  spectra  are  different  for  the  two  planets. 
That  is  because  as  seen  from  Earth  Neptune  is  smaller  than  Uranus. 

Figure  8 (upper  portion)  shows  a schematic  of  a planet  with  the 
spectrograph  slit  across  it  at  approximately  a 45°  angle  (this  is  called  the 
position  angle).  It  took  the  powerful  4 meter  telescope  to  do  this  because 
the image of the planet had to be large enough to encompass the whole
slit. Figure 8 (lower portion) shows how the placement of the slit connects
to  the  spectral  line. 

Belton  developed  the  glass  plates  and  handed  them  over  to 
Howard  for  reducing  (i.e.,  obtaining  the  data). 


Figure  8 — Image  of  Uranus  with  overlaid  slit  (top  image).  The  parallel  black  lines 
represent  the  slit.  Tilted  spectral  lines  (bottom  image)  are  shown  with  two  dotted  lines 
showing  where  the  light  from  the  planet  appears  on  the  spectral  line. 

The photometry method for Neptune was handled differently from
the spectral line method. The Kitt Peak 50-inch telescope had an infrared
filter, so the observer would see infrared light and not much else.[xii] The
light from Neptune passed through the telescope and filter to land on a
photomultiplier tube, which turns photons into electrons. From there the
signal went to a Brown chart recorder (an antique observing tool),
which fed a continuous strip of paper through, much as a lie detector
does. The strip of paper recorded the infrared signal from
Neptune for as long as one could observe through the night. If the signal
never changed as the night wore on, then there was little to see as Neptune
rotated. But if the signal dropped occasionally and in a regular manner,
then the time between drops would give an estimate of the rotation period.
They hoped for the best and indeed found this particular signature for
Neptune.[xiii] However, the data were rough and not well defined.
Nevertheless, they agreed fairly well with the spectroscopic results.
Figure 9 shows a sample of the Neptune data. Time (in terms of fractional
periods) is plotted along the horizontal axis, brightness along the vertical
axis. Note the dip in the middle of the graph. They were seeing something
(unknown) that caused the light from the planet to decrease in a regular
way.
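
The idea of reading a rotation period off the spacing of recurring dips can be sketched in a few lines. This is a minimal illustration on a synthetic chart record, not the Neptune data: the 16-hour spacing, the 10% dip depth, the threshold, and the function names are all made up for the example.

```python
def dip_centres(times, signal, threshold):
    """Return the mean time of each contiguous run of samples below threshold."""
    centres, run = [], []
    for t, s in zip(times, signal):
        if s < threshold:
            run.append(t)
        elif run:
            centres.append(sum(run) / len(run))
            run = []
    if run:  # a dip still in progress when the record ends
        centres.append(sum(run) / len(run))
    return centres


def estimate_period(times, signal, threshold):
    """Average spacing between successive dips; needs at least two dips."""
    centres = dip_centres(times, signal, threshold)
    if len(centres) < 2:
        raise ValueError("need at least two dips to estimate a period")
    gaps = [b - a for a, b in zip(centres, centres[1:])]
    return sum(gaps) / len(gaps)


def in_dip(t):
    """True within half an hour of a dip centred at 3 h, 19 h, 35 h, ..."""
    phase = (t - 3.0) % 16.0
    return phase <= 0.5 or phase >= 15.5


# Synthetic record: 48 h sampled every 15 minutes.
times = [0.25 * i for i in range(193)]
signal = [0.9 if in_dip(t) else 1.0 for t in times]

period = estimate_period(times, signal, threshold=0.95)
print(f"estimated period: {period:.2f} h")
```

Real chart records are noisy, so a threshold test like this would need smoothing first; the sketch only shows why regularly spaced drops translate directly into a period.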


Figure  9 — infrared  photometry  of  Neptune 

Data Reduction

Howard began reducing the spectral data. She chose to measure
orders 46 (near 5000 Å) and 47 (near 4900 Å). These orders were near the
center of the plate and were well exposed, with no overlapping orders.
She identified individual lines using the lunar spectrum as a reference so
that she used only Fraunhofer lines. Step one was to “digitize” the data
with an automated scanning machine called a microdensitometer. The
exposed glass plate was placed on the platen and automatically moved
step by step to measure the density of each spot on the plate. She used a
20 × 20 µm aperture, stepping every 10 µm both along and across the
spectra. Each order was digitized in overlapping strips 20 µm wide and
10,240 µm long. In this way each spot on the glass plate was turned into a
number stored on a 7-track magnetic tape. The microdensitometer was
controlled by a PDP-8 computer (way back there in early
computers). It took weeks of work just to get the numbers stored on a
7-track tape.[xiv] Today the data would be taken with a CCD (charge-coupled
device) chip and recorded digitally right at the start.

There were no floppy disks, DVDs, or thumb drives. This was long
before the days of the laptop, so she used a VAX minicomputer for the
actual data processing. Vaxen were very nice, robust machines.[xv] Kitt
Peak had one of these that everyone shared. A VAX could read a 7-track tape.
She wrote several computer programs to read the data from the tape. Each
microdensitometer scan was cross-correlated with a running set of
weights approximating the function that represented the slit. The position
of the line was defined as the zero-crossing of the numerical derivative of
the cross-correlation. The tilt angle was found by a least-squares fit of a
straight line through the zero-crossings. In other words, she digitally
reproduced the tilted line. The results for two lines are shown in Figure
10. The left side shows a good fit. The right side shows a poor fit.
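
The three steps described here (cross-correlate each scan with slit-like weights, take the zero-crossing of the numerical derivative as the line position, then fit a straight line through the positions) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the original programs: the scans are synthetic, the slit function is replaced by a simple boxcar, and every name in the code is hypothetical.

```python
import math


def cross_correlate(scan, weights):
    """Slide `weights` along `scan`; one value per fully overlapping offset."""
    m = len(weights)
    return [sum(scan[j + k] * weights[k] for k in range(m))
            for j in range(len(scan) - m + 1)]


def line_position(scan, weights):
    """Sub-pixel centre of an absorption line: the zero-crossing of the
    numerical derivative of the cross-correlation, bracketing its minimum."""
    cc = cross_correlate(scan, weights)
    j = min(range(len(cc)), key=cc.__getitem__)
    if j == 0 or j == len(cc) - 1:
        raise ValueError("line too close to the edge of the scan")
    d0 = cc[j] - cc[j - 1]      # derivative at j - 0.5 (negative or zero)
    d1 = cc[j + 1] - cc[j]      # derivative at j + 0.5 (positive or zero)
    frac = 0.5 if d0 == d1 else d0 / (d0 - d1)   # linear interpolation
    return (j - 0.5 + frac) + (len(weights) - 1) / 2.0


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Synthetic plate: 20 scans across the spectrum, each with a Gaussian
# absorption line whose centre drifts by 0.1 pixel per scan (a made-up tilt).
TRUE_SLOPE = 0.1
rows = list(range(20))
scans = [[1.0 - 0.5 * math.exp(-0.5 * ((x - (20.0 + TRUE_SLOPE * r)) / 3.0) ** 2)
          for x in range(50)] for r in rows]

slit_weights = [1.0] * 5        # boxcar stand-in for the slit function
centres = [line_position(scan, slit_weights) for scan in scans]
slope, intercept = fit_line(rows, centres)
tilt_deg = math.degrees(math.atan(slope))
print(f"fitted slope = {slope:.3f} px/row, tilt = {tilt_deg:.2f} deg")
```

The tilt angle follows from the fitted slope only if both plate axes are in the same units, as they are in this synthetic example.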

As a backup she also made large hard-copy prints of the plates
(like Figure 7). She hand-measured the tilts with a compass, protractor,
and ruler as a check on the automated procedure. Interestingly enough, the
errors in the hand check were about the same as the errors with the
automated procedure.

Figure  10  — two  spectral  line  fits

In the case of Uranus the root-mean-square (rms)[xvi] deviation of
the line tilt angles was about ±1.7° and the standard error of the mean
approximately ±0.4°. The rms deviation for Neptune was much larger:
about ±3°, with a standard error of the mean of about ±0.9°.
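
The two numbers quoted for each planet are linked by the usual relation between the scatter of individual measurements and the uncertainty of their mean. As a rough consistency check, assuming the tilt measurements are independent (the number of lines measured, N, is not given in the text and is only inferred here):

```latex
\sigma_{\bar{\theta}} = \frac{\sigma_{\mathrm{rms}}}{\sqrt{N}}
\quad\Longrightarrow\quad
N \approx \left(\frac{\sigma_{\mathrm{rms}}}{\sigma_{\bar{\theta}}}\right)^{2},
\qquad
N_{\mathrm{Uranus}} \approx \left(\frac{1.7}{0.4}\right)^{2} \approx 18,
\qquad
N_{\mathrm{Neptune}} \approx \left(\frac{3}{0.9}\right)^{2} \approx 11 .
```

These counts are only indicative, since the quoted errors are themselves rounded.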

Knowing  the  measured  tilt  angle  is  not  enough.  The  line  tilt  must 
be  corrected  for  the  astronomical  seeing.  This  is  the  largest  source  of 
possible  error.  “Seeing”  is  a measure  of  the  quality  of  the  sky.  Is  the  image 
a pinpoint  or  sharp  (good  seeing)  or  is  it  smeared  out  (poor  seeing)? 
Celestial  objects  blur  and  twinkle  because  of  turbulent  mixing  in  the 
Earth’s  atmosphere.  Astronomers  always  hope  for  a clear  night  with  good 
seeing.  It  can  happen  that  a night  can  be  quite  clear  yet  unusable  because 
of  poor  quality  seeing. 

The  angular  size  of  Uranus  as  seen  from  Earth  is  a known  value. 
One  can  then  estimate  how  large  the  observed  planet  is  with  respect  to 
that  known  value.  This  is  a measure  of  the  seeing  - in  essence,  how 
“fuzzy”  is  Uranus.  The  same  method  is  used  for  Neptune.  Of  course 
“fuzzy”  can  be  a bit  subjective,  hence  the  possibility  of  error.  In  this  case 
the  value  of  the  ‘seeing’  is  the  full  width  at  half  intensity  of  the  Gaussian 
smoothing  function  required  to  explain  the  distribution  of  density  of  the 
plate  across  the  dispersion  (i.e.,  how  tall  is  the  spectrum).  In  other  words, 
use  a Gaussian  distribution  to  map  the  width  of  the  planet  (or  height  of  the 
spectrum). 

From there one estimates the effective seeing corrections by
matching the cross-dispersion distribution of intensity on each plate with
the intensity in a model spectrum that results from a convolution of a
Gaussian function and a model planetary limb darkening[xvii] function at the
spectroscopic slit. That is a lot of fancy words that mean: match the
observed profile of Uranus with the known profile of Uranus. The
difference between the two is a measure of the seeing. The estimate of
seeing therefore depends on the radius assumed for each planet. Howard
and Belton assumed Uranus to have a radius of 25,900 km (actual value
25,362 km) and Neptune to have a radius of 24,500 km (actual value
24,622 km).
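
The matching step can be illustrated with a toy model. The sketch below blurs a limb-darkened disk with Gaussians of several widths and picks the width that best reproduces an "observed" profile. It rests on several assumptions that are not from the original paper: a linear limb-darkening law, a one-dimensional cut through the disk center, a simple grid search, and made-up numbers in arcseconds. All function names are hypothetical.

```python
import math


def disk_profile(x, radius, u):
    """Brightness along a cut through the centre of a limb-darkened disk.
    The linear law I = 1 - u * (1 - mu) is an assumed, illustrative choice."""
    if abs(x) >= radius:
        return 0.0
    mu = math.sqrt(1.0 - (x / radius) ** 2)
    return 1.0 - u * (1.0 - mu)


def gaussian_kernel(fwhm, step):
    """Normalised Gaussian sampled every `step`, out to about 4 sigma."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    half = int(math.ceil(4.0 * sigma / step))
    k = [math.exp(-0.5 * (i * step / sigma) ** 2) for i in range(-half, half + 1)]
    total = sum(k)
    return [v / total for v in k]


def convolve(profile, kernel):
    """Same-length discrete convolution; sky beyond the grid is taken as dark."""
    half = len(kernel) // 2
    n = len(profile)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < n:
                acc += w * profile[idx]
        out.append(acc)
    return out


def model_profile(xs, radius, u, fwhm):
    """Limb-darkened disk blurred by Gaussian seeing of the given FWHM."""
    step = xs[1] - xs[0]
    disk = [disk_profile(x, radius, u) for x in xs]
    return convolve(disk, gaussian_kernel(fwhm, step))


def best_fwhm(observed, xs, radius, u, candidates):
    """Seeing FWHM whose model best matches `observed` in least squares."""
    def misfit(fwhm):
        model = model_profile(xs, radius, u, fwhm)
        return sum((o - m) ** 2 for o, m in zip(observed, model))
    return min(candidates, key=misfit)


# Made-up example: a disk of radius 1.9 arcsec blurred by 2.0 arcsec seeing.
xs = [-8.0 + 0.1 * i for i in range(161)]
observed = model_profile(xs, radius=1.9, u=0.5, fwhm=2.0)
trial_fwhm = [1.0 + 0.25 * i for i in range(9)]
print("recovered seeing FWHM:", best_fwhm(observed, xs, 1.9, 0.5, trial_fwhm))
```

Because the model disk enters through `radius` and `u`, a wrong assumed radius or limb darkening shifts the recovered seeing, which is the dependence the text describes.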

For  Uranus  this  correction  meant  the  line  tilt  needed  to  increase  by 
32%.  For  Neptune  it  was  a 202%  increase  (Neptune  was  small  and  fuzzy)! 
How  good  was  this  estimate  of  the  seeing?  The  uncertainty  in  the 
corrected  tilt  arising  from  estimating  the  seeing  was  about  ±8%  for  Uranus 
and  about  ±12%  for  Neptune. 
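
In terms of arithmetic, a percentage increase is a multiplicative factor on the measured tilt. Writing t for the measured tilt (the text does not say whether the correction is applied to the angle or to its tangent, so t is left generic here):

```latex
t_{\mathrm{corr}}^{\mathrm{Uranus}} = (1 + 0.32)\, t_{\mathrm{meas}} = 1.32\, t_{\mathrm{meas}},
\qquad
t_{\mathrm{corr}}^{\mathrm{Neptune}} = (1 + 2.02)\, t_{\mathrm{meas}} = 3.02\, t_{\mathrm{meas}} .
```

The Neptune tilt was therefore roughly tripled by the seeing correction, so any error in the assumed seeing carries far more weight for Neptune than for Uranus.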

At long last they had the line tilt, $\theta$. Onward to the rotation.

The angular velocity,[xviii] $\omega$, is a measure of the rotation period
(the period is $2\pi/\omega$). $\omega$ is directly related to the line tilt
and is independent of the radius of the planet.

There are a number of other things to consider when getting a
rotation rate, $\omega$, from the line tilt. Ultimately the relationship
between the angular velocity, $\omega$, of a planet near opposition and the
tilt of the spectral line, $\tan\theta$, in reflected light is (see the
original paper for the derivation of this equation):[xix]

$$
\omega \cos\beta_{\oplus} \sin(\psi_{\mathrm{pole}} - \psi_{s})
= \frac{206265\, c\, D(\lambda) \tan\theta}{(1 + \cos\phi)\, s\, d\, \lambda}
$$

where $c$ is the velocity of light, $\lambda$ the wavelength of the line,
$D(\lambda)$ the plate dispersion, $s$ the plate scale, $\phi$ the planetary
phase angle, $d$ the distance to the planet, $\psi_{s}$ the position
angle of the slit, $\psi_{\mathrm{pole}}$ the position angle of the pole, and
$\beta_{\oplus}$ the latitude of the observer (us) with respect to the
planet’s equator. All the parameters on the right-hand side are known or
measured. Thus, one can solve for the quantities on the left side: the
position angle of the pole, $\psi_{\mathrm{pole}}$, the angular velocity,
$\omega$ (and hence the rotation period), and the spin direction (clockwise
or counterclockwise). The position angle is the angle measured in the plane
of the sky going counterclockwise from north. It is a standard tool in
astronomy.
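
As a sketch of how the relation is used, the snippet below reads it as ω cos β sin(ψ_pole − ψ_s) = 206265 c D(λ) tan θ / [(1 + cos φ) s d λ] and solves it for the rotation period. All numerical inputs (dispersion, plate scale, distance, angles, and the 10° tilt) are hypothetical placeholders chosen only to exercise the formula. They are not the values from the 1977 paper, and the function names are illustrative.

```python
import math

ARCSEC_PER_RAD = 206265.0
C_KM_S = 299_792.458  # speed of light, km/s


def projected_rate(tan_theta, disp, scale, dist_km, wavelength, phase_deg):
    """Left-hand side, omega*cos(beta)*sin(psi_pole - psi_s), in rad/s.

    disp: plate dispersion (angstrom/mm); scale: plate scale (arcsec/mm);
    wavelength in angstrom; dist_km: Earth-planet distance in km.
    """
    reflect = 1.0 + math.cos(math.radians(phase_deg))
    return (ARCSEC_PER_RAD * C_KM_S * disp * tan_theta
            / (reflect * scale * dist_km * wavelength))


def rotation_period_hr(tan_theta, disp, scale, dist_km, wavelength,
                       phase_deg, beta_deg, psi_pole_deg, psi_slit_deg):
    """Rotation period in hours; the sign of the result follows the sign of
    the tilt and of sin(psi_pole - psi_s), i.e. the spin direction."""
    geom = (math.cos(math.radians(beta_deg))
            * math.sin(math.radians(psi_pole_deg - psi_slit_deg)))
    omega = projected_rate(tan_theta, disp, scale, dist_km,
                           wavelength, phase_deg) / geom
    return 2.0 * math.pi / omega / 3600.0


# Hypothetical inputs, for illustration only.
params = dict(disp=2.0, scale=10.0, dist_km=2.7e9, wavelength=5000.0,
              phase_deg=1.0, beta_deg=30.0, psi_pole_deg=283.0,
              psi_slit_deg=193.0)
tan_tilt = math.tan(math.radians(10.0))
period = rotation_period_hr(tan_theta=tan_tilt, **params)
print(f"rotation period for a 10 deg tilt: {period:.1f} h")
```

Note that the planet's radius never appears: doubling tan θ halves the period regardless of how large the disk is, which is the independence from radius mentioned above.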

One can see that the actual data reduction is rather complex. After
all the work and calculations were done, they had their answers.

They found that the spin of Uranus is retrograde; the spin of
Neptune is prograde. This confirmed early spectroscopic results.

For Uranus the direction of its pole points a little south of the
orbital plane, thus making Uranus truly a sideways planet. The position
angle of the pole, projected onto the plane of the sky, is 283° ± 4° (i.e., 13°
south of the equator). As it revolves, Uranus rotates like a drunken
astronomer rolling around the floor. Near the time of Uranian solstices,
one pole continuously faces the Sun, and the other one faces away. Only
a narrow strip around the equator experiences a rapid day-night cycle with
the Sun low over the horizon, as in Earth’s polar regions. Each pole gets
about 42 years of continuous sunlight, followed by 42 years of darkness.
Near the time of the equinoxes, the Sun faces the equator of Uranus, giving
a period of day-night cycles similar to those seen on most of the other
planets.

For Neptune the pole points north of the orbital plane, agreeing
with earlier results. The value for Neptune is 32° ± 11°. Remember, the
Earth’s pole is tilted about 23.5°.

The rotation periods Howard and Belton found differed
significantly from earlier work. For Uranus they obtained 24 ± 3 hr, and
for Neptune 22 ± 4 hr.

As an additional check they were fortunate to obtain from Lowell
Observatory the original plates taken in 1912 by Lowell and Slipher for
Uranus and Mars. They used the same data reduction method and found
significant differences between their Uranus/Mars results and Lowell and
Slipher’s values. They had no explanation for these differences. They
also used spectra of Jupiter reduced the same way and found agreement
with the known rotation period.

Now done, they published the results. The work made a sizable
splash in the news, even showing up in the New York Times and Popular
Mechanics.

Naturally one rarely abandons a scientific project. A few years
later (1980) they discovered that their estimates of the seeing were in
error. The error had little effect on Uranus, but had a much greater effect
on Neptune due to its greater distance and smaller angular size.[xx] Basically
they learned that the limb darkening of Neptune may be about the same
as that of Uranus. They had assumed that the two planets had different limb
darkening values. Once they changed this parameter the corrected value
for Neptune's rotation period became 15.4 ± 3 hr. If the limb darkening is
less for Neptune, then the rotation period is lengthened.


Conclusion 

What did Voyager find when it got there? For Uranus the rotation
period is 17h 14m. They were a bit off there. For Neptune the rotation
period is 16h 6m. So, oddly, they were closer for the more distant planet.
The axial tilt for Uranus is 97.77° (about 8° south of the orbital plane; they
got 13°). The axial tilt for Neptune is 28.32°, not too far from their
determination.
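
How far off "a bit off" is can be put in units of the quoted errors. The short sketch below is an illustration only: it treats each ± value as a one-sigma uncertainty, which is an assumption, and the helper name is hypothetical. The periods are the ones quoted in this article.

```python
def sigma_offset(measured, error, reference):
    """Difference between a measurement and a reference value, in units of
    the quoted error (assumed here to be one sigma)."""
    return (measured - reference) / error


URANUS_VOYAGER_HR = 17.0 + 14.0 / 60.0   # 17h 14m
NEPTUNE_VOYAGER_HR = 16.0 + 6.0 / 60.0   # 16h 6m

cases = {
    "Uranus (1977)": (24.0, 3.0, URANUS_VOYAGER_HR),
    "Neptune (1977)": (22.0, 4.0, NEPTUNE_VOYAGER_HR),
    "Neptune (1980 correction)": (15.4, 3.0, NEPTUNE_VOYAGER_HR),
}
for label, (value, err, ref) in cases.items():
    print(f"{label}: {sigma_offset(value, err, ref):+.2f} sigma")
```

On this reading the Uranus result sits a little over two error bars from the Voyager value, the original Neptune result about one and a half, and the corrected Neptune result well within one.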

All  in  all,  a real  visit  is  worth  the  price  of  admission. 

The two Voyager missions are still operating as they move through
the heliosheath, the place where the interstellar gas meets the solar wind.
They have long since left the Solar System planets behind. Their current
locations are continually updated on the Voyager website,
http://voyager.jpl.nasa.gov/. Check it out. In 2015 Voyager had its 38th
birthday, and it is the longest-operating of any NASA satellite.


Uranus  as  seen  by  Voyager 

[i] Laplace (1749 - 1827) was a French mathematician and astronomer.

[ii] English astronomer/mathematician.

[iii] This effect had been suggested by the English astronomer Mary Somerville (1780 - 1872).

[iv] French mathematician.

[v] German astronomer.

[vi] Galileo had probably observed Neptune, but he thought it was a star.

[vii] Basically, the flatter the planet the faster it rotates.

[viii] Astronomers no longer use glass plates. Today everything is digital. But in the mid
1970s glass plates were still common.

[ix] The spectral lines from galaxies also tilt. One can use the line tilt to determine the
mass of the galaxy.

[x] The plate scale can be described as the number of degrees, or arcminutes or
arcseconds, corresponding to a number of inches, or centimeters, or millimeters
(etc.) at the focal plane (where an image of an object is “seen”).

[xi] The dependence of refraction on the wavelength of light is called dispersion. A lens
or prism disperses light.

[xii] Of course, no filter is perfect. They had to correct for leaks in the filter.

[xiii] When Voyager encountered Neptune it saw a “large spot,” a storm rather like the
Great Red Spot on Jupiter. They must have been observing that spot as it rotated.

[xiv] People do not use 7-track tapes for data storage any more.

[xv] Vaxen are almost gone now too.

[xvi] A measure of the error.

[xvii] Limb darkening refers to the diminishing of intensity in the image of a star or planet
as one moves from the center of the image to the edge or “limb” of the image.

[xviii] Angular speed at which the planet is rotating.

[xix] Howard-Hayes, S. and Belton, M., “The Rotational Periods of Uranus and Neptune”,
Icarus, 32, 383-401 (1977).

[xx] Belton, M. J. S., Wallace, L., Howard-Hayes, S., and Price, M. J., “Neptune’s
Rotation Period: a Correction and a Speculation on the Difference between
Photometric and Spectroscopic Results”, Icarus, 42, 71-78 (1980).

Editorial  Remarks

Big  Data  is  a buzzword  we  hear  often  these  days.  Datasets  can  be  so  large 
or  so  complex  that  traditional  data  processing  applications  are  inadequate. 
We  shall  peek  into  a conversation  that  speaks  to  big  data  validity, 
credibility,  applicability,  and  its  broader  implications. 

The  Fall  issue  of  the  Journal  is  dedicated  to  a Symposium  entitled  “Big 
Data  Analytics  and  Workforce  Issues:  Initiatives,  Research,  and 
Challenges”  which  was  part  of  the  7th  annual  Dupont  Summit  held  this  past
December  in  Washington,  DC,  and  sponsored  by  the  Policy  Studies 
Organization.  The  objectives  of  the  Symposium  included  articulating 
critical  research  and  policy  questions  on  big  data  and  identifying  problems 
that  must  be  faced  to  answer  them.  Moreover,  the  presenters  discussed  the 
explosion  of  big  data  in  and  across  different  contexts  (academia,  industry, 
and  government  - the  “triple  helix”)  and  at  different  levels  of  analysis. 

The organizer of the Symposium was Connie McNeely from George Mason
University. The moderator was Jong-on Hahm, also from George Mason
University.

Panelists  were  Philip  Bourne,  National  Institutes  of  Health;  Heng  Xu, 
National  Science  Foundation;  Erik  Kuiler,  Systems  Made  Simple,  Inc.;  Lisa 
Frehill,  Energetics  Technology  Center;  Michelle  Schwalbe,  National 
Research  Council;  and  Laurie  Schintler,  George  Mason  University. 

We  are  fortunate  that  Connie  McNeely  gathered  papers  from  the  panelists 
for  publication  in  the  Journal.  Her  paper  leads  the  discussion  and  introduces 
the  rest  of  the  contributors.  Enjoy  an  unusual  and  unique  look  at  Big  Data. 

Kirk  Borne  starts  us  off  describing  the  concept  of  Big  Data  and  providing 
an  introduction  to  the  papers. 


Sethanne  Howard 
Editor 

Introduction

Big  Data  Nation  - Foundations,  Applications,  and  Implications 

Kirk  Borne 

Principal  Data  Scientist,  Booz  Allen  Hamilton 

Data  is  the  new  oil,  the  new  natural  resource,  the  new  black,  the  new 
bacon,  and  the  new  gold  rush.  Such  statements  have  been  made  in  one  form 
or  another,  and  most  have  been  labeled  as  “big  data”  hype.  Nevertheless, 
despite  the  hype,  the  growth  in  data  is  unmistakably  a real  (and  really  big) 
phenomenon.  Fortunately,  the  growth  is  not  just  in  the  volume  of  our  data 
collections,  but  also  in  the  value,  opportunities,  and  insights  that 
organizations  can  now  achieve  through  the  exploration  and  exploitation  of 
their  massive  (and  growing)  data  assets.  The  papers  published  here  cover 
several  dimensions  of  this  data-driven  revolution  in  the  business  of 
everything:  business,  education,  research,  government,  finance,  healthcare, 
natural  resources,  our  personal  and  social  lives,  and  more.  In  the  paper  by 
Topi  and  Markus  (“Educating  Data  Scientists  in  the  Broader  Implications 
of  their  Work”),  the  authors  categorize  data  science  into  three  bodies  of 
knowledge:  Applications,  Infrastructure,  and  Implications.  If  we  re-label 
the  second  of  these  as  “Foundations”,  then  we  have  not  only  a useful 
mnemonic  (i.e.,  the  three  "-ations"),  but  we  also  have  a sensible 
categorization  of  the  papers  that  are  presented  here. 

The  foundations  upon  which  we  build  big  data  and  data  science 
applications  for  discovery,  insights,  and  innovation  include  basic  research 
and  engineering.  That  includes  academic  research  as  well  as  data 
engineering  for  infrastructure  research  and  development.  The  paper  “Social 
Media  Analysis  for  Higher  Education”  by  Berea,  Rand,  Wittmer,  and  Wall 
is  an  excellent  example  of  academic  research  on  mining  a particular  type  of 
data:  social  media  text  data.  The  authors  explore  students’  views  of  their 
higher  education  experience  through  a common  social  media  analytics 
technique, sentiment analysis. The paper “Big Data: Who’s Accountable?”
by Hahm takes us through three case studies where bias and
miscategorization of data have led to inaccurate (and sometimes controversial)
results; the importance of starting with a proper foundation in data
sampling, data integration, and interpretation of data analytics conclusions
is emphasized throughout. A related paper, “Exploring Bias and Error in Big
Data Research” by Seely-Gant and Frehill, examines research ethics
(foundations and implications) of sample bias and erroneous interpretations,
which are increasingly common in the big data era (especially when
working with social media, open source platforms, and online user data).

The  applications  of  big  data  and  data  science  are  everywhere,  and 
there  are  papers  here  that  examine  those  cases.  The  paper  “Big  Data 
Adoption  in  the  Health  Care  Domain:  Challenges  and  Perspectives”  by 
Kuiler  looks  at  applications  in  healthcare  (improving  patient  care  and 
population  health),  while  also  addressing  foundational  issues  (workforce 
development)  and  implications  (data  anonymization  and  data  privacy 
confront  data  sharing).  In  the  paper  “Everything  Old  is  New  Again:  The  Big 
Data  Workforce”,  Frehill  looks  at  the  abundant,  novel,  and  game-changing 
applications  of  big  data  across  all  sectors  (e.g.,  business,  health,  and
finance).  The  author  then  explores  how  this  changes  workforce 
development,  education  programs,  consumer/customer  experience,  and  the 
landscape  of  data-driven  decision-making  in  organizations. 

Finally,  the  implications  of  what  we  are  doing  with  our  data 
collections  deserve  special  attention,  both  here  as  well  as  in  data  science 
and  analytics  education  programs.  One  of  those  areas  of  focus  in  academic 
programs  should  be  data  ethics.  There  are  very  few  examples  of  ethics 
courses  that  specifically  address  the  big  data  era  - one  of  those  is  the  “Data 
Ethics  in  an  Information  Society”  course  in  the  George  Mason  University 
Computational  and  Data  Sciences  degree  program.  Several  papers  here 
address  such  societal  and  workforce  implications.  The  aforementioned 
paper  by  Topi  and  Markus  specifically  examines  the  legal,  ethical,  and 
societal  implications  of  analytics  and  data  science,  specifically  in  the 
context  of  training  data  scientists  and  analysts  to  be  aware  of  and  diligent  in 
minimizing  the  possible  harmful  consequences  of  their  analytics 
applications.  The  paper  “Big  Data  Analytics  and  Workforce  Issues: 
Prospects  and  Challenges  in  the  Information  Society”  by  McNeely 
examines  both  applications  and  implications  of  big  data  analytics,  with  a 
central  focus  on  the  latter,  specifically:  challenges  across  technical,  social, 
political,  and  economic  dimensions.  In  the  insightful  paper  “Privacy  in  a 
Networked  World:  New  Challenges  and  Opportunities  for  Privacy 
Research”  by  Xu  and  Jia,  the  authors  investigate  new  concepts, 
consequences,  and  concerns  related  to  privacy  in  our  increasingly  digital 
lives.  They  examine  human-data  interaction,  information  linkability,
information  ephemerality,  and  information  identifiability.  All  of  this 
naturally  leads  to  concerns  about  information  liability,  something  which  all 
data  analytics  professionals  must  weigh  on  the  balance  sheet  of  information 
and  data  assets. 

As  data  emerges  from  the  era  of  big  data  hype,  contributing  value 
beyond  our  large  data  repositories,  and  blooms  into  a major  organizational 
asset  and  a major  organizational  product,  the  papers  presented  here  will 
provide  valuable  guidance,  insights,  and  perspectives  concerning  the 
foundations,  applications,  and  implications  of  data  analytics  in  a data- 
drenched  world. 


Bio 

Kirk  Borne,  PhD  is  an  astrophysicist,  Big  Data  science  consultant,  public 
speaker,  and  the  Principal  Data  Scientist  in  the  Strategic  Innovation 
Group  at  Booz  Allen  Hamilton.  He  previously  spent  12  years  as  tenured 
Professor  at  George  Mason  University  in  the  Computational  Science, 
Informatics,  and  Data  Science  programs.  Before  that,  he  worked  18  years 
on  various  NASA  contracts,  as  research  scientist,  as  a manager  on  a large 
science  data  systems  contract,  and  as  the  Hubble  Telescope  Data  Archive 
Project  Scientist.  He  also  actively  promotes  data  literacy  by  disseminating 
information  related  to  data  science  and  analytics  on  social  media,  where  he 
has been named consistently since 2013 among the top worldwide
influencers in big data and data science.

Big  Data  Analytics  and  Workforce  Issues:
Prospects  and  Challenges  in  the  Information  Society 


Connie  L.  McNeely 

George  Mason  University 

Abstract 

Big  data  is  one  of  the  most  critical  features  marking  and  defining  our  world 
today.  It  constitutes  an  analytical  space  encompassing  processes  and 
technologies  that  can  be  applied  across  a wide  range  of  domains  in  the  current 
and  growing  information  society.  The  articles  presented  in  this  issue  address 
related  challenges  and  prospects  as  crucial  considerations  in  technical,  social, 
political,  and  economic  power  and  relations  in  national  and  international 
contexts.  With  particular  attention  to  conceptual  delineations,  analytical 
applications,  and  educational  and  workforce  dynamics,  they  attend  to  both 
instrumental  and  intrinsic  aspects  of  big  data  relative  to  society  in  general. 
Together,  the  articles  constitute  a conversation  that  speaks  to  big  data  validity, 
credibility,  applicability,  and  broader  societal  implications  — both  positive  and 
negative  — today  and  in  the  future. 

Introduction 

Big  data,  in  all  of  its  manifestations  and  applications,  is  the  beating  heart 
of  today’s  burgeoning  information  society.  On  the  one  hand,  big  data  has 
been  acclaimed  in  line  with  promises  for  societal  benefits.  However,  on  the 
other  hand,  big  data  also  has  sparked  controversies  and  debates  on  the 
challenges  and  vulnerabilities  that  it  has  created  relative  to  social,  political, 
and  economic  power  and  relations.  Whether  addressed  in  terms  of  technical, 
social,  or  organizational  perspectives,  relevant  topics  are  in  the  forefront  of 
initiatives  in  academia,  government,  and  industry.  Moreover,  the  advent  of 
big  data  has  raised  questions  about  those  who  use  it  and  those  who  work 
with  it,  especially  in  light  of  socio-cultural  and  structural  dynamics  and 
disparities  in  terms  of  educational  and  workforce  dynamics.  Accordingly, 
the  objectives  of  the  articles  in  this  collection  include  articulating  critical 
research  and  policy  questions  and  identifying  challenges  to  answering  them 
and  to  engaging  big  data  effectively.  Employing  perspectives  that  address 
both  instrumental  and  intrinsic  aspects  of  big  data,  they  offer  an  effective 
and  comprehensive  view  on  the  prospects  and  challenges  of  big  data 
analytics  and  workforce  issues  in  and  across  different  contexts  and  at 
different  levels  of  analysis.  Together,  the  articles  constitute  a conversation 
that  speaks  to  big  data  validity,  credibility,  applicability,  and  broader
societal  implications  now  and  in  the  future. 

New  opportunities  and  prospects,  but  also  new  challenges, 
controversies,  and  vulnerabilities,  have  marked  the  explosion  of  big  data  as 
a phenomenon  in  and  of  itself.  In  2000,  only  a quarter  of  all  stored 
information  was  digital;  by  2013,  more  than  98  percent  of  the  world’s  stored 
information  was  digital  (Mayer-Schonberger  and  Cukier  2013).  Indeed, 
“the  world  contains  an  unimaginably  vast  amount  of  digital  information 
which  is  getting  ever  vaster  ever  more  rapidly.  This  makes  it  possible  to  do 
many  things  that  previously  could  not  be  done:  spot  business  trends,  prevent 
diseases,  combat  crime,  and  so  on.  Managed  well,  the  data  can  be  used  to 
unlock  new  sources  of  economic  value,  provide  fresh  insights  into  science, 
and  hold  governments  to  account”  (Economist  2010)  — but  they  also  create 
a host  of  new  problems,  with  misuse  and  misinformation,  security  concerns, 
privacy  violations,  etc.  at  the  top  of  many  related  policy  agendas.  The  ever- 
increasing  body  of  data  is  a core  operational  feature  in  virtually  every  sector 
of  society,  and  how  we  understand  and  use  big  data  is  increasingly  the 
defining  feature  of  our  times. 

Conceptual  Dimensions 

Big  data  is  a multidimensional  concept  referring  to  the  exponential 
growth  and  availability  of  both  structured  and  unstructured  data  (SAS 
2013),  embracing  technology,  decision  making,  and  policy.  Big  data  has 
largely  been  interpreted  in  terms  of  the  “3  Vs”:  volume,  velocity,  and 
variety.  That  is,  "big  data  is  high  volume,  high  velocity,  and/or  high  variety 
information  assets  that  require  new  forms  of  processing  to  enable  enhanced 
decision  making,  insight  discovery,  and  process  optimization"  (Beyer  and 
Laney  2012;  Laney  2001).  Volume  indicates  the  increasing  amount  of  data, 
velocity  indicates  the  speed  of  data,  especially  the  rate  at  which  it  is  created 
or  becomes  available,  and  variety  indicates  the  range  of  data  types  and 
sources  (Laney  2001).  The  compilation  of  large  complex  datasets  has  made 
for  massive  volumes  of  data  characterized  by  variety  that  reflect  the 
different  types  of  structured  and  unstructured  data  that  are  collected; 
velocity  refers  to  how  quickly  these  data  can  be  made  available  for  analysis 
(UA  2015).  Together,  these  dimensions  comprise  a basic  model  for 
describing  big  data. 

However, other “Vs” also have been included, especially variability
and veracity, such that reference to the “5 Vs” has become common. Along
with the variety and complexity that mark big data, variability is reflected
in inconsistencies in data flows (SAS 2013). The veracity of the data
represents an especially critical issue. Veracity is an indication of data
integrity and the extent to which it can be trusted for analytical and
decision-making purposes (UA 2015). Methods for data verification and validation,
as  specifically  applied  to  big  data,  are  of  particular  importance  in  this 
regard.  In  addition,  another  “V”  — value  — is  sometimes  discussed  as  a 
separate  dimension  of  big  data,  highlighting  the  value-added  capacity  of  big 
data  (IDC  2012). 

In  any  case,  while  there  is  a lack  of  consistent  definition,  the  term  “big 
data”  has  reached  some  general  agreement  among  various  stakeholders  as 
constituting  at  least  some  indication  of  volume,  signaling  the  size  of  datasets 
as  the  critical  factor  (Ward  and  Barker  2013).  After  all,  the  allusion  is  to 
“big” data in relative terms. Big data is derived from various sources, in
particular streaming data such as the Internet of Things, social media data,
and publicly available open data. The conversion of large collections of
documents  from  print  to  digital  format  is  giving  rise  to  massive  archives  of 
unstructured  data,  and  social  media,  crowdsourcing  platforms,  and  various 
applications  are  producing  reams  of  information  from  the  real-time 
transactions  of  people  around  the  world.  The  complex  structure,  behavior, 
and  permutations  of  datasets  are  a fundamental  consideration  in  describing 
data  as  big  (Ward  and  Barker  2013).  However,  having  said  that,  “big  data 
is  less  about  data  that  is  big  than  it  is  about  a capacity  to  search,  aggregate, 
and  cross-reference  large  data  sets”  (boyd  and  Crawford  2012,  p.  663). 
Underlying  the  concept  of  big  data  are  the  technologies  — the  tools  and 
techniques — that  are  used  to  process  massive  or  complex  datasets.  Hence, 
we  can  refer  to  big  data  as  a term  describing  the  storage  and  analysis  of 
large  and/or  complex  datasets  using  a series  of  applicable  techniques  (Ward 
and  Barker  2013). 

From  an  expanded  theoretical  and  practical  perspective,  big  data  also 
has  been  described  as  a cultural,  technological,  and  scholarly  phenomenon, 
resting  on  the  interplay  of  technology,  analysis,  and  mythology  (boyd  and 
Crawford  2012,  p.  663):

1)  Technology:  maximizing  computation  power  and  algorithmic
accuracy  to  gather,  analyze,  link,  and  compare  large  data  sets.

2)  Analysis:  drawing  on  large  data  sets  to  identify  patterns  to  make
economic,  social,  technical,  and  legal  claims.

3)  Mythology:  the  widespread  belief  that  large  data  sets  offer  a
higher  form  of  intelligence  and  knowledge  that  can  generate 
insights  that  were  previously  impossible,  with  the  aura  of  truth, 
objectivity,  and  accuracy. 

In  more  encapsulated  terms,  big  data  reflects  “a  point  of  view,  or 
philosophy,  about  how  decisions  will  be  — and  perhaps  should  be  — made 
in  the  future”  (Lohr  2013). 

Data-to-Knowledge-to-Action  Analytics 

“The  challenge  of  big  data  is  to  convert  it  into  useable  information  by 
identifying  patterns  and  deviations  from  those  patterns”  (UA  2015).  In 
epistemological  terms,  information  is  composed  of  a  collection  of  data,  and
knowledge  is  established  through  different  strands  of  information
(Economist  2010),  leading  to  questions  that  speak  to  the  process  of
converting  data  to  knowledge  to  action.  For  example,  what  are  the  analytical
and  policy  implications  of  the  data  in  light  of  how  and  why  they  are
collected,  categorized,  and  aggregated?  Do  such  data  tasks  reflect  on  how 
they  are  or  should  be  used?  Furthermore,  as  elsewhere  queried  by  McNeely 
and  Hahm  (2014),  will  the  analysis  of  big  data  provide  insights  and 
information  that  will  allow  the  development  of  answers  to  big  questions,  or 
will  it  simply  provide  larger  scale  versions  of  answers  already  attained  with 
smaller  data?  Frankly,  actual  understanding  is  not  stressed  in  most  big  data 
approaches;  correlations  are  the  rule,  representing  a move  away  from 
actually  understanding  phenomena  to  simply  indicating  associations 
(Mayer-Schonberger  and  Cukier  2013). 

The  challenge  of  turning  data  into  knowledge  reflects  matters  of  data 
interpretation  and  re-purposing  relative  to  secondary  data  markets 
(Washington  2014).  In  practice  big  data  might  yield  information,  but  not 
necessarily  understanding.  Keep  in  mind  that  data  gain  meaning  only  in 
context.  What  critical  or  fundamental  factors  must  be  considered  for  true 
understanding?  The  socio-technical  limitations  of  big  data  rest  on 
considerations  of  context  and  meaning  and,  as  such,  big  data  must  be 
engaged  with  an  appreciation  of  both  its  power  and  its  limitations.  More  to 
the  point,  while  big  data  is  increasing,  the  ability  to  translate  it  into
knowledge  and,  more,  to  extract  wisdom  from  it  is  relatively  rare  (McNeely 
and  Hahm  2014,  p.  307;  Economist  2010). 

Large  datasets  have  long  been  around  and  in  use  in  various  fields. 
However,  the  big  data  revolution  invokes  a different  frame  for  engaging 
them.  The  integration  of  data  from  various  sources  and  the  use  of  that  data 
for  purposes  beyond  those  for  which  it  was  originally  collected  or  created 
are  principal  tasks  associated  with  big  data  use  (Berman  2013).  Moreover, 
at  this  point,  it  appears  certain  that  data  will  continue  to  get  “bigger  and 
bigger.”  The  Internet  of  Things  is  expected  to  comprise  tens  of  billions  of 
objects  by  the  end  of  this  decade  and  is  actively  and  instantaneously  sensing 
data  on  virtually  every  aspect  of  our  lives  and  environment.  Noting  this 
trend,  Kuiler  (p.  11)  looks  to  the  volumes  of  clinical,  financial,  and 
consumer  information  available  to  healthcare  organizations.  Mapping  a 
complex  multi-disciplinary  approach  to  big  data  analytics,  he  focuses  on 
questions  related  to  health  and  bioinformatics.  In  application,  he  categorizes 
and  reviews  a wide  range  of  structured  and  unstructured  data  and  offers  an 
innovative  approach  to  performance  measurement  in  the  healthcare  domain.
Overall,  he  provides  evidence  on  the  use  of  big  data  analytics  for  reducing 
operational  costs  and  optimizing  performance,  for  improving  regulatory 
compliance,  and  for  increasing  returns  on  investments,  while  also 
delineating  future  trends  in  big  data  analytics.  However,  he  also  explores 
challenges  and  barriers  to  big  data  analytics  and  use,  discussing  limitations 
and  difficulties  incurred  in,  for  example,  industry  refusals  to  share  data, 
institutional  barriers,  and  information  governance.  Framing  big  data  as  a 
trope  for  a number  of  different  technological  and  institutional  factors,  he 
points  to  the  problems  of  an  abundance  of  data  relative  to  a scarcity  of 
information,  noting  that  more  data  is  not  always  better. 

Approaching  such  issues  from  a different  direction,  practical  questions 
of  data  veracity  also  are  fundamental  for  converting  data  to  knowledge  to 
action.  Problems  of  sampling  bias  are  particularly  relevant  in  this  regard. 
Sampling  bias  is  inherent  in  many  big  datasets.  How  might  that  affect  policy 
development  and  implementation?  Seely-Gant  and  Frehill  (p.  29)  examine 
related  complications  along  these  lines,  discussing  how  sampling  issues, 
especially  selection  bias,  associated  with  big  data  sources  can  have
far-reaching  implications  for  analysis  and  interpretation.  Furthermore,  in  the
same  vein,  Hahm  (p.  23)  offers  a  commentary  on  accountability  and  data
veracity,  pointing  out  how  sampling  and  sorting  bias  and  errant 
categorizations  can  lead  to  inaccurate  conclusions,  which  can  be 
particularly  dangerous  for  informing  policy  decisions. 
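
The effect of selection bias described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not an analysis drawn from the articles in this issue: the population, the 30% trait rate, and the inclusion probabilities are all hypothetical, and the function name `simulate_selection_bias` is introduced only for illustration. It compares a small simple random sample with a much larger self-selected ("organic") sample.

```python
import random

def simulate_selection_bias(population_size=100_000, seed=0):
    """Compare a small random sample with a large self-selected sample."""
    rng = random.Random(seed)

    # Hypothetical population: about 30% of people hold some trait of interest.
    population = [rng.random() < 0.30 for _ in range(population_size)]
    true_rate = sum(population) / population_size

    # Small simple random sample: every person is equally likely to be drawn.
    random_sample = rng.sample(population, 1_000)
    random_estimate = sum(random_sample) / len(random_sample)

    # Large "organic" sample: people with the trait are ASSUMED to be three
    # times as likely to appear in the data (for example, to post on a platform).
    organic_sample = [
        has_trait
        for has_trait in population
        if rng.random() < (0.60 if has_trait else 0.20)
    ]
    organic_estimate = sum(organic_sample) / len(organic_sample)

    return true_rate, random_estimate, len(organic_sample), organic_estimate


if __name__ == "__main__":
    true_rate, random_est, organic_n, organic_est = simulate_selection_bias()
    print(f"true rate:                 {true_rate:.3f}")
    print(f"random sample (n=1000):    {random_est:.3f}")
    print(f"organic sample (n={organic_n}): {organic_est:.3f}")
```

Under these assumptions the self-selected sample is many times larger than the random sample yet lands much farther from the true rate, because adding more records does not correct for who is missing from the data.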

While  also  encompassing  these  problems,  one  of  the  most  prominent 
and  controversial  issues  that  arises  in  discussions  of  big  data  is  privacy.  Xu 
and  Jia  (p.  73)  probe  this  topic,  examining  changing  conceptions  of  privacy 
in  today’s  big  data  environment  in  terms  of  information  identifiability, 
ephemerality,  and  linkability.  They  apply  this  conceptual  approach  to 
investigate  threats  to  information  privacy  in  light  of  the  collection  and 
analysis  of  large-scale  data  from  social  networking  sites.  Focusing  on 
human-data  interaction,  they  turn  to  problems  and  risks  associated  with,  for 
example,  data  de-identification  and  re-identification,  data  integration,  and 
legal  obligations  and  developments  with  regard  to  privacy  issues.  Their 
primary  emphasis  is  on  mapping  privacy  regulations  into  actionable 
information  technology  requirements  that  are  re-usable  across  systems. 

Education  and  Workforce  Dynamics 

Big  data  engagement  and  related  topics  are  relevant  within  and  across 
sectors  and  require  examination  from  technical,  social,  and  organizational 
perspectives.  The  skills,  training,  and  education  necessary  for  big  data 
related  jobs  in  industry,  government,  and  academia  have  become  a focus  of 
discussions  on  educational  attainment  relative  to  workforce  trajectories. 
Especially  given  assertions  of  a skills  gap  for  manipulating,  analyzing,  and 
understanding  big  data,  the  relationship  between  education  and  the 
development  of  the  big  data  workforce  is  a critical  point  of  departure  for 
delineating  the  field  in  general.  Further,  the  role  of  big  data  in  affecting 
social,  political,  and  economic  relations  and  power  comes  into  play  as
reflected  in  questions  of  educational  and  workforce  opportunity  and  access,
and  also  raises  questions  of  the  “digital  divide.”  Do  gatekeepers  come  into
play  with  big  data,  as  in  other  fields,  precluding  certain  individuals  or 
groups  from  accessing  data  or  participating  in  relevant  fields?  In  general, 
basic  questions  on  building  proficiency  in  big  data  and  workforce 
development  are  at  the  forefront  of  debates  in  different  sectors  (NRC  2014). 
That  is,  what  should  be  taught,  by  whom,  to  whom,  and  how? 

Technically  speaking,  training  and  education  for  big  data  jobs  typically
require  a basic  knowledge  of  statistics,  quantitative  methods,  or 
programming,  upon  which  applicable  skillsets  can  be  built.  Such 
background  can  be  acquired  in  a number  of  fields  that  have  long 
incorporated  related  preparation.  For  example,  Frehill  (p.  49)  notes  that 
“social  scientists  have  worked  with  exceptionally  large  data  sets  for  quite 
some  time,  historically  accessing  remote  space,  writing  code,  analyzing 
data,  and  then  telling  stories  about  human  social  behavior  from  these 
complex  sources.”  However,  she  differentiates  between  traditional  large 
“designed”  datasets  and  the  new  “organic”  big  data  that  are  calling  for  more 
and  more  trained  knowledge  workers  with  the  required  “deep  skills  and 
talent.”  Examining  the  relationship  between  higher  education  and  the  development
of  the  big  data  workforce,  she  addresses  basic  questions  about  participation 
and  also  considers  key  lessons  regarding  gender  differences  in  the  big  data 
workforce. 

Overarching  changes  in  occupational  roles  and  practices  in  the  face  of 
technological  shifts  have  led  to  revised  workforce  expectations  and  needs. 
Some  estimates  suggest  a shortage  by  2018  of  some  190,000  data  scientists 
in  the  United  States,  in  addition  to  1.5  million  analysts  and  managers  with 
knowledge  and  skills  to  use  analyses  of  big  data  to  make  effective  decisions 
(GovLab  2013;  MGI  2011).  Frankly,  when  an  industry  or  field  is  growing 
rapidly,  “it  is  not  unusual  for  a shortage  of  workers  to  occur  until 
educational  institutions  and  training  organizations  build  the  capacity  to 
teach  more  individuals,  and  more  people  are  attracted  to  the  needed 
occupations”  (CEA  2014,  p.  41).  Thus,  Topi  and  Markus  (p.  39)  investigate 
the  growing  number  of  analytics  and  data  science  programs,  arguing  for  the 
need  to  include  an  emphasis  on  the  implications  and  consequences  of 
practices  and  applications  in  related  fields.  They  note  the  need  for  big  data 
workers  “who  are  sensitive  to  data  downsides  as  well  as  upsides”  to  achieve 
the  benefits  of  big  data  while  avoiding  harmful  consequences.  From  yet  a 
different  perspective,  the  investigation  presented  by  Berea,  Rand,  Wittmer, 
and  Wall  (p.  63)  rests  on  social  media  analysis,  using  big  data  itself  in  their 
research  on  big  data  analytics  within  education  and  related  policies  and 
reflecting  the  changing  data  landscape. 

Conclusion

The  effects  of  big  data  are  being  felt  everywhere.  As  an  analytical 
space,  big  data  encompasses  processes  and  technologies  that  can  be  applied 
across  a wide  range  of  domains  “from  business  to  science,  from  government 
to  the  arts”  (Economist  2010),  with  positive  and  negative  implications
depending  on  perspective  and  application.  As  such, 

Big  Data  triggers  both  utopian  and  dystopian  rhetoric.  On  one  hand, 
Big  Data  is  seen  as  a powerful  tool  to  address  various  societal  ills, 
offering  the  potential  of  new  insights  into  areas  as  diverse  as  cancer 
research,  terrorism,  and  climate  change.  On  the  other,  Big  Data  is  seen 
as  a troubling  manifestation  of  Big  Brother,  enabling  invasions  of 
privacy,  decreased  civil  freedoms,  and  increased  state  and  corporate 
control.  As  with  all  socio-technical  phenomena,  the  currents  of  hope 
and  fear  often  obscure  the  more  nuanced  and  subtle  shifts  that  are 
underway.  (boyd  and  Crawford  2012,  pp.  663-664)

Big  data  raises  new  issues  and  concerns  related  to,  for  example,  privacy, 
liability,  security,  and  access,  and  has  been  invoked  relative  to  new  ways  of 
thinking  about  the  world  and  relations  across  contexts.  It  has  led  to  new 
possibilities  and  prospects  for  research  and  policy,  with  fundamental  issues 
turning  on  cultural,  organizational,  and  technological  capacities  at  the  heart 
of  debates  and  practices  within  and  across  academia,  industry,  and 
government.  Attending  to  issues  of  research  and  knowledge  production,  of 
education  and  workforce  dynamics,  of  socio-cultural,  political,  and 
economic  relations,  the  articles  presented  in  this  issue  interrogate  and 
examine  critical  related  issues  from  various  perspectives,  addressing 
challenges  and  prospects  for  big  data  in  theory  and  application  in  the 
growing  information  society. 

Big  Data  Adoption  in  the  Health  Care  Domain:  Challenges  and  Perspectives


Erik  W.  Kuiler 

George  Mason  University 

Abstract 

Due  to  recent  technological  advancements,  health  care  organizations  now  have 
access  to  large  volumes  of  clinical,  financial,  and  consumer  information  from 
which  to  identify  patterns  and  trends.  As  with  other  industries,  health  care  is 
grappling  with  the  best  ways  to  decipher  and  leverage  these  big  data  sets,  with  the 
ultimate  goals  to  enhance  patient  care  and  improve  population  health.  The  sheer 
magnitude  of  the  number  of  available  data  is  both  a boon  and  a hurdle.  When 
interpreting  data,  more  information  is  not  always  better,  unless  an  organization 
assesses  these  data  to  discern  what  are  noise  and  what  are  not.  This  paper  explores 
a number  of  challenges  and  barriers  to  big  data  analytics  and  use. 

Introduction 

The  healthcare  domain  has  witnessed  a rapid  growth  in  the  delivery  of 
data-driven  medicine  resulting  from  the  introduction  of,  for  example, 
electronic  health  records,  digital  imaging,  digitized  procedures,  increasing 
sophistication  in  lab  test  formulation,  the  real-time  availability  of  sensor 
data,  and,  what  stands  out  in  the  popular  press,  the  introduction  of 
genomics-related  projects  (Ohno-Machado  2012;  Shah  and  Tenenbaum 
2012).  Information  technology  (IT)  advances  have  led  to  a discourse  on  the 
applicability  of  big  data  (a  term  coined  by  the  Gartner  Group,  an  IT  industry 
market  research  organization)  to  health  data  analytics  (for  example,  Sahoo 
et  al.  2013).  This  study  summarizes  a presentation  made  at  the  2014  Dupont 
Summit,  held  in  Washington  DC,  and  explores  topics  considered  in  that 
discussion.  The  paper  concludes  with  a preliminary  assessment  of  future 
trends. 


Adopting  the  Gartner  Group’s  definition,  trade  journals  tend  to 
emphasize  three  big  data  properties,  collectively  referenced  as  the  three  V’s: 
volume  - to  denote  an  exponentially  large  data  set,  ranging  in  size  from  one 
or  more  terabytes  (10^12  bytes)  to  multiple  petabytes  (10^15  bytes)  or  exabytes  (10^18  bytes);
velocity  - to  indicate  data  that  arrive  as  continuous  streams,  rather  than  as
transaction  or  database  files;  and  variety  - to  designate  data  sets  that  contain
both  structured  and  unstructured  data  that  may  be  subject  to  different 
semantics  and  in  different  formats,  gathered  from  diverse  sources.1
Discussing  big  data  in  its  historical  perspective,  Jacobs  (2009)  offers  a 
definition  of  the  term  that  is  perhaps  more  useful  because  it  places  big  data 
in  its  proper  IT  context:  “Big  data  should  be  defined  at  any  point  in  time  as
‘data  whose  size  forces  us  to  look  beyond  tried-and-true  methods  [of  storage
and  manipulation]  that  are  prevalent  at  that  time.’”  From  Jacobs’  point  of
view,  in  the  1960s  data  files  that  could  not  be  managed  effectively  with  a
single  tape  mount  could  be  considered  as  the  big  data  of  that  era.  Currently,
the  capabilities  to  ingest,  analyze,  and  manage  multi-petabyte  data  sets  have
underscored  the  limitations  of  our  data  analytics  capabilities  supported  by
Relational  Database  Management  Systems.  These  data  management
limitations  have  led  to  the  introduction  of  specific  IT  applications  that 
address  the  volume  and  velocity  requirements  of  big  data,  much  as  in  the 
1960s-80s  the  availability  of  multi-tape  data  sets  led  to  the  introduction  of
mechanical  “tape  monkeys”  (Jacobs’  term)  to  swap  tapes  in  and  out.

Because  of  its  use  in  the  popular  press,  the  term  big  data  has  become 
a trope  for  a number  of  different  technologies  and  institutional  conflicts 
between  the  rights  of  the  states  and  the  rights  of  the  citizenry:  cloud-based 
data  and  information  analytics  and  big  data  management  systems,  data 
interoperability  as  well  as  NSA  spying,  insurance  denials  based  on  big  data-based
trend  analyses,  and  security  lapses  that  may  lead  to  data  breaches  and
the  loss  of  personally  identifiable  information  (PII)  - any  data  that, 
collectively  or  severally,  may  potentially  identify  a specific  individual 
human  being. 

The  scope  of  the  healthcare  domain  is  extensive,  comprising  the 
activities  of  diverse  epistemic  communities,  each  of  which  has  its  own 
institutional  paradigms  and  cultural  imperatives,  resulting  in  a contested 
equilibrium  (adapting  Amartya  Sen’s  phrase  1982,  1999)  between  different 
interest  groups:  clinical  health,  focused  on  the  delivery  of  patient-centered 
healthcare  services;  public  health,  including  clinical  case  surveillance, 
syndromic  surveillance,  prevention,  preparedness,  and  health  promotion  in 
a community;  population  health,  focused  on  health  outcomes  of  a group  of 
individuals,  including  the  distribution  of  such  outcomes,  in  a population; 
environmental  health,  focused  on  physical,  chemical,  and  biological  factors 
external  to  a person,  and  all  the  related  factors  that  may  have  an  impact  on 
individual  behavior;  and,  since  the  1990s,  genomics  - genes,  genomes,
proteins,  cells,  ecological  systems. 

Impetus  to  Big  Data  Adoption  in  the  US 

In  the  United  States,  the  impetus  for  big  data-based  health 
informatics  came  during  2008-2010.  Under  the  Health  Information 
Technology  (HIT)  for  Economic  and  Clinical  Health  (HITECH)  component 
of  the  American  Recovery  and  Reinvestment  Act  of  2009,  the  Centers  for 
Medicare  and  Medicaid  Services  reimburse  health  service  providers  for  using
electronic  documents  in  formats  certified  to  comply  with  HITECH’s 
Meaningful  Use  (MU)  standards.  The  Patient  Protection  and  Affordable 
Care  Act  of  2010  (ACA)  promotes  access  to  health  care  and  greater  use  of 
electronically  transmitted  documentation.  Health  informatics  are  expected 
to  provide  a framework  for  the  electronic  exchange  of  health  information 
that  complies  with  all  legal  requirements  and  standards  and,  consequently, 
expands  the  delivery  of  comparative  effectiveness-  and  evidence-based
medicine.  HITECH  MU  and  ACA  support  the  adoption  of  Electronic  Health 
Records  (EHR)  as  the  preferred  method  for  data  interoperability  among 
patients,  healthcare  providers,  and  healthcare  payers  (HHS  2015). 

Benefits  of  Big  Data  Adoption 

Today,  the  availability  of  large  data  sets  is  the  norm  rather  than  the 
exception.  With  the  adoption  of  data  analytics,  end-users  have  become 
increasingly  data  literate,  so  that,  while  a  simple  spreadsheet  would
have  sufficed  earlier,  now  end-users  expect  to  see  more  complex  models, 
such  as  the  results  of  time-series  probability-based  analyses  to  complement 
snap-shot  descriptive  statistics. 

The  adoption  of  big  data  acquisition,  management,  and  analytics 
provides  a number  of  important  benefits:  large  sample  size  - the  larger  the 
size,  the  greater  the  probability  that  the  sample  will  accurately  reflect  the 
characteristics  of  the  population;  increased  predictive  power  - studies  based 
on  big  data  samples  are  more  likely  to  give  statistically  significant  results; 
and  a  strong  foundation  for  purposeful  action  - cluster  and  category  analytics  of
very  large  data  samples  support  the  development  of  treatments  and 
protocols  that  are  more  accurately  tailored  to  the  specific  needs  of  patient 
populations  (or  cohorts). 
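
The sample-size benefit listed above can be made concrete with the standard error of an estimated proportion, sqrt(p(1 - p)/n), which shrinks as the sample size n grows. The following is a minimal sketch, assuming a simple random sample; the function name and the example proportion are illustrative and do not come from the presentation summarized here.

```python
import math

def standard_error_of_proportion(p, n):
    """Standard error of a sample proportion p estimated from n observations.

    Assumes simple random sampling; it does not account for selection bias.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    return math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    # Hypothetical example: a condition observed in half of the sampled records.
    for n in (100, 10_000, 1_000_000):
        se = standard_error_of_proportion(0.5, n)
        print(f"n = {n:>9,}: standard error = {se:.4f}")
```

Each hundredfold increase in sample size reduces the standard error tenfold. The gain applies only to random sampling error; a large sample drawn from an unrepresentative source can still be precisely wrong.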

In  addition,  big  data  analytics  support  proactive  wellness  and  disease
management  by  discovering  patterns;  for  example,  snap-shots  of  current 
operations,  likely  future  trends,  metrics  of  program  efficacy  and  efficiency, 
prospective  needs  of  a population;  decision  support  data  to  chart  future 
directions,  data  to  support  knowledge-intensive  problem  definition  and 
resolution  (diagnostics,  research,  policy  analysis,  etc.).  Also,  big  data 
analytics  enable  healthcare  improvements  by,  for  example,  integrating 
clinical  and  claims  data  so  that  they  are  accessible,  searchable,  and 
reportable;  aggregating  data  from  patient  encounters  to  support  public  and 
population  health  management;  identifying  and  targeting  individual  patients 
and  cohorts  for  outreach;  assessing  quality  of  care  across  provider  networks; 
and  correlating  clinical  and  financial  risk  measures  to  optimize  health  care 
delivery.  Additionally,  big  data  provide  answers  to  important  questions, 
such  as:  How  effective  is  a particular  program,  in  terms  of  access  and 
results;  is  a client  population  served  as  well  as  it  could  be?  Looking  at 
specific  parameters,  what  policy  changes  should  be  enacted  to  make  the 
program  better?  Should  more  resources  be  allocated  and  of  what  kind, 
when,  and  where?  What  is  the  likelihood  of  a patient  suffering  a stroke, 
given  his  or  her  lifestyle,  and,  based  on  these  probabilities,  what  kind  of 
ameliorative  regimen  should  be  proposed?  What  is  the  likelihood  of  a 
provider  committing  fraud,  given  certain  characteristics,  and,  as  a corollary, 
given  the  historical  pattern  of  this  provider’s  behavior  compared  to  that  of 
other  providers,  is  this  one  committing  fraud?  How  can  a fraud  detection 
program  be  improved  by  operationalizing  the  analytics  model? 
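
As one hedged illustration of the provider-comparison question raised above, the sketch below flags a provider whose billing level departs sharply from that of the other providers. It is a simplified screening heuristic, not a fraud detection method described in this paper: the provider labels, the billing figures, the threshold of 3.0, and the function name `flag_unusual_providers` are all hypothetical.

```python
from statistics import mean, stdev

def flag_unusual_providers(billing, threshold=3.0):
    """Return providers whose value is far from the mean of the OTHER providers.

    `billing` maps a provider label to a single numeric measure (for example,
    average claims per patient). At least three providers are required so that
    a standard deviation of the others can be computed.
    """
    if len(billing) < 3:
        raise ValueError("need at least three providers to compare")

    flagged = {}
    for provider, value in billing.items():
        others = [v for p, v in billing.items() if p != provider]
        spread = stdev(others)
        if spread == 0:
            continue  # the others are identical; no scale for comparison
        z = (value - mean(others)) / spread
        if abs(z) >= threshold:
            flagged[provider] = z
    return flagged


if __name__ == "__main__":
    # Hypothetical average claims per patient for six providers.
    example = {"A": 102, "B": 98, "C": 105, "D": 95, "E": 100, "F": 240}
    for provider, z in flag_unusual_providers(example).items():
        print(f"provider {provider} is {z:.1f} standard deviations from its peers")
```

A flag of this kind indicates only that a provider is statistically unusual relative to peers. Whether the pattern reflects fraud, a different patient mix, or a coding practice remains a question for human review.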

Barriers  to  Big  Data  Adoption 

While  big  data  analytics  can  improve  the  delivery  and  quality  of 
healthcare,  there  are  institutional  barriers  to  their  adoption,  including 
adversarial  relationships  between  healthcare  practitioners  and  HIT  vendors, 
lack  of  government  incentives,  economic  limitations,  and  ethical  and  moral 
constraints  centered  on  data  ownership,  stewardship,  and  human  rights. 

Information  Governance  and  Management  Challenges 

Prior  to  HITECH,  health  data  sharing  was  usually  limited  to  patient- 
physician  communications,  and  data  interoperability  between  healthcare 
providers  was  limited  to  facsimile  distribution  and  similar  dissemination 
methods.  The  Internet  changed  all  this,  and  the  frequently  cloud-based 
aggregation  of  EHR  data  in  very  large  collections  (petabyte  data  sets,  for
example),  sustained  by  big  data  management  and  analytics,  offers 
opportunities  to  understand  diagnoses,  treatments,  and  protocols  on  a large 
scale,  providing  an  important  complement  to  clinical  trials.  Nevertheless, 
big  data’s  promise  of  increased  data  interoperability  and  information 
sharing  has  exacerbated  issues  of  syntactic  conformity  and  semantic  clarity 
that  have  plagued  data  analytics  since  their  inception  in  1960s  automated
data  processing  environments.  These  issues  require  more  than  technological 
solutions  because  the  issues  have  their  provenance  in  the  cultural  and 
institutional  determinants  of  the  epistemic  domains  to  which  they  apply, 
rather  than  in  the  IT  systems  that  support  the  analytics  of  such  determinants. 

Current  HIT  capabilities  support  the  integration  of  data  from  diverse 
sources  that  are  frequently  managed  as  data  silos  - for  example,  patient 
clinical  data,  adverse  event  data,  product  data  (drugs,  medical  devices, 
blood,  consumer,  etc.),  environmental  and  toxicological  data,  genomic  data,
industry-provided  data  (insurance,  product,  etc.)  - without  considerations 
of  data  interoperability.  Internally  as  well  as  externally,  epistemic 
communities  in  the  health  domain  support  different  lexica  and  ontologies, 
thereby  restricting  the  possibility  of  efficient  information  sharing.  For 
example,  in  the  clinical  community,  the  International  Statistical
Classification  of  Diseases  and  Related  Health  Problems  Versions  9  and  10
(ICD-9  and  ICD-10),  although  managed  by  the  same  agency,  are  not  fully
compatible.  Furthermore,  these  two  standards  are  not  fully  compatible  with 
the  Systematized  Nomenclature  of  Medicine-Clinical  Terms,  another 
frequently  used  standard,  and  require  resource-intensive  “cross-walks.”  In 
the  genomics  community,  researchers  employ  at  least  two  standards, 
depending  on  where  they  operate  in  the  world  community:  the  GenBank  file 
format  or  the  Swiss-Prot  format.  In  the  toxicology  domain,  the  US  National
Institute  of  Environmental  Health  Sciences’  participation  in  the
toxicogenomics  ontology  and  global  database  initiatives  is  critically 
important  in  establishing  a common  lexicon  and  ontology. 
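
To show why the cross-walks mentioned above are resource-intensive, the sketch below looks up ICD-9-CM codes in a small ICD-10-CM mapping table. It is a minimal illustration under stated assumptions: the three entries are illustrative only and should not be relied on for coding, the table and function names are hypothetical, and a production cross-walk would draw on an officially published mapping with many thousands of rows.

```python
# Illustrative ICD-9-CM to ICD-10-CM entries only (an assumption for this
# sketch). Some source codes map to a single target; others map to several
# candidates and need a reviewer to choose among them.
ICD9_TO_ICD10 = {
    "401.9": ["I10"],
    "250.00": ["E11.9"],
    "493.90": ["J45.909", "J45.998"],
}

def crosswalk(icd9_code):
    """Return a (status, candidate_codes) pair for one ICD-9-CM code."""
    targets = ICD9_TO_ICD10.get(icd9_code, [])
    if not targets:
        return "unmapped", []
    if len(targets) == 1:
        return "one-to-one", targets
    return "needs review", targets


if __name__ == "__main__":
    for code in ("401.9", "493.90", "999.99"):
        status, candidates = crosswalk(code)
        print(f"{code}: {status} {candidates}")
```

Only the one-to-one case can be converted automatically. The "needs review" and "unmapped" cases are where the manual effort accumulates when records coded under one version must be analyzed alongside records coded under the other.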

There  are  also  different  conveyance  and  transportation  frameworks 
for  transporting  data:  the  HL7  Version  2.x  (V2)  messaging  standard  is, 
arguably,  the  most  widely  implemented  standard  for  health  data  information 
exchange  in  the  world.  However,  this  standard  is  not  compatible  with  the 
Fast  Healthcare  Interoperability  Resources  (FHIR)  health  information  exchange  and
Clinical  Document  Architecture  (CDA)  specifications  maintained  by  the
same  organization.  Moreover,  neither  the  ICD-9/10  nor  the  HL7  standards 
are  compatible  with  the  American  National  Standard  X12  Electronic  Data 
Interchange  transactions  (the  274-278  and  834-835  series  of  transactions).
There  are  also  different  communication  architectures  in  use:  point-to-point 
(peer-to-peer),  and  central  repository  (push/pull),  etc. 
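
For readers unfamiliar with the HL7 V2 messaging standard discussed above, the sketch below parses one patient identification (PID) segment. HL7 V2 messages are delimited text in which fields are conventionally separated by "|" and components within a field by "^". This is a simplified illustration, not a conformant parser: the sample segment and the function name are hypothetical, and field repetition, escape sequences, and subcomponents are ignored.

```python
def parse_pid_segment(segment):
    """Extract a few fields from an HL7 V2 PID segment given as one string."""
    fields = segment.strip().split("|")
    if fields[0] != "PID":
        raise ValueError("not a PID segment")

    def field(number):
        # PID-n is at index n after splitting; absent trailing fields are empty.
        return fields[number] if number < len(fields) else ""

    family_name, _, rest = field(5).partition("^")
    given_name = rest.split("^")[0]

    return {
        "patient_id": field(3).split("^")[0],  # PID-3, first component
        "family_name": family_name,            # PID-5, first component
        "given_name": given_name,              # PID-5, second component
        "birth_date": field(7),                # PID-7, YYYYMMDD in this sample
        "sex": field(8),                       # PID-8
    }


if __name__ == "__main__":
    # Hypothetical segment invented for this example.
    sample = "PID|1||12345^^^HOSP^MR||DOE^JANE||19700101|F"
    print(parse_pid_segment(sample))
```

The positional layout is compact but carries no self-describing structure, which is one reason data in this form cannot be exchanged directly with document-oriented or transaction-oriented formats without translation.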

Divergent  Views  of  Product  Requirements 

Clinicians  want  HIT  products  that  are  tailored  to  support  their 
specific  processes  and  protocols  (based  on  my  conversations  with 
practitioners  and  vendors  at  the  2014  AMIA  national  conference).  Vendors
want  to  capture  the  largest  market  share  possible  at  the  lowest  cost;  hence, 
the  impetus  to  develop  a generic,  one-size-fits-all  solution  as  the  most 
efficient  model.  There  are  also  industry  barriers  to  health  data 
interoperability.  Many  EHR  vendors  treat  healthcare  data  as  proprietary 
assets  that  can  offer  considerable  market  advantages.  Also,  many  lifestyle- 
focused  vendors  (for  example,  in  the  tobacco,  soft  drink,  and  fast  food 
industries)  resist  health  research  and,  consequently,  data  sharing. 

Lack  of  Funding 

The  Federal  government  has  offered  programmatic  incentives  to 
enhance  healthcare  delivery  but  these  are  frequently  insufficient  and  not 
sustained.  For  example,  a number  of  programs,  such  as  the  Beacon 
Community  Cooperative  Agreement  Program,  have  come  to  an  end.  The 
purpose  of  this  program  was  to  demonstrate  how  health  IT  investments  and 
the  use  of  EHRs  could  advance  the  vision  of  patient-centered  care,  while
achieving  better  health  and  better  care  at  lower  cost.  The  Health  and  Human 
Services  (HHS)  Office  of  the  National  Coordinator  for  Health  IT  provided 
$250  million  over  three  years  to  17  selected  communities,  each  with  its 
unique  population  and  regional  context,  throughout  the  United  States  that 
had  already  made  inroads  in  the  development  of  secure,  private,  and 
accurate  systems  of  EHR  adoption  and  health  information  exchange.  When 
the  funding  dried  up,  a number  of  Beacon  Communities  incorporated 
elements  of  the  program  into  their  organizational  structures  and  formed 
consortia  at  their  own  cost,  but  it  is  not  likely,  in  the  long  run,  that  these
efforts  can  be  sustained. 

In  2014,  87%  of  US  hospitals  had  some  form  of  EHR  system  (Cohen
et  al.  2014).  In  the  US  medical  community  many  large  institutions  have 
adopted  the  use  of  EHRs,  data  sharing,  and  big  data  analytics;  however, 
many  small  practices  do  not  have  the  resources  to  adopt  EHRs  because  they 
are  expensive  and  there  are  few  incentives  to  support  their  adoption.  Local 
communities  and  regional  governments  usually  do  not  have  the  resources 
to  assist  medical  practices  with  adopting  EHR  and  big  data  analytics. 

On  the  international  level,  many  Southern  Cone  countries  do  not 
have  the  resources  to  provide  basic  healthcare  to  their  citizens,  let  alone 
support  an  HIT  infrastructure  required  to  support  EHR-based  medicine,  big 
data  management,  and  analytics  (to  which  the  UN’s  efforts  to  reach  its
Millennium  Development  Goals  can  attest;  see  UN  2015).  Likewise,  International
Non-Governmental  Organizations  that  have  limited  financial,  organizational,
and  temporal  resources  must  frequently  operate  in  adversarial  environments 
created  by  host  governments. 

Data  Ownership  and  Data  Stewardship 

There  are  also  institutional  barriers  to  health  big  data  analytics  that 
focus  on  data  ownership  and  stewardship.  One  benefit  of
introducing  EHRs  and  Personal  Health  Records  (PHRs)  is  the  institutionalization  of
patient-focused  healthcare,  in  which  patients  become  active  partners  in  their
healthcare  paradigms  (for  example,  to  mitigate  patients’  strategic 
ignorance:  “my  doctor  knows  what  is  best  for  me,  and  I expect  her  to  notify 
me  when  things  may  go  wrong.”).2  Patients  own  their  data;  the  medical 
establishment  and  the  government  are  data  stewards.  This  uneasy  alliance 
raises  questions  of  when,  and  under  what  circumstances,  these  personal  data
may  be  shared  and  how  personal  identity  data  can  be  protected  against  theft 
and  unauthorized  access.  To  ensure  the  privacy  of  individually  identifiable 
health  information  in  accordance  with  the  Health  Insurance  Portability  and 
Accountability  Act  of  1996  (HIPAA),  health  data  records  must  be 
“anonymized”  by  removing  all  Personally  Identifiable  Information  (PII) 
from  such  records  prior  to  their  use  in  data  analytics.  Data  anonymization 
techniques  are  not  fool-proof.  A recent  study  noted  that,  in  the  absence  of 
PII,  it  is  still  possible  to  join  records  with  a reasonable  degree  of  accuracy 
(for  advertising  purposes)  from  two  discrete  data  sources  based  on  date  of 
birth,  5-digit  residential  zip  code,  and  gender  (cited,  among  others,  by 
Cavoukian  and  El  Emam  2011;  see  also  Kum  et  al.  2013  for  an  approach  to
preserving  privacy  in  interactive  record  linkages). 
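
The linkage risk described above can be sketched in a few lines. This is a minimal illustration, not the method of the cited study: both record sets, the field names, and the "Resident" labels are hypothetical, and the join simply looks for a unique match on date of birth, 5-digit zip code, and gender.

```python
from collections import defaultdict

# Hypothetical "anonymized" health records: names removed, quasi-identifiers kept.
health_records = [
    {"dob": "1961-07-04", "zip": "20052", "gender": "F", "diagnosis": "type 2 diabetes"},
    {"dob": "1975-11-30", "zip": "20052", "gender": "M", "diagnosis": "hypertension"},
    {"dob": "1988-02-14", "zip": "22201", "gender": "F", "diagnosis": "asthma"},
]

# Hypothetical second source (for example, a marketing list) that still carries names.
public_records = [
    {"name": "Resident A", "dob": "1961-07-04", "zip": "20052", "gender": "F"},
    {"name": "Resident B", "dob": "1975-11-30", "zip": "20052", "gender": "M"},
    {"name": "Resident C", "dob": "1990-05-09", "zip": "22201", "gender": "F"},
]

QUASI_IDENTIFIERS = ("dob", "zip", "gender")


def quasi_key(record):
    """Build the join key from the quasi-identifier fields."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(anonymized, identified):
    """Return (name, diagnosis) pairs where the quasi-identifiers match exactly one person."""
    index = defaultdict(list)
    for rec in identified:
        index[quasi_key(rec)].append(rec["name"])
    matches = []
    for rec in anonymized:
        names = index.get(quasi_key(rec), [])
        if len(names) == 1:  # a unique match re-identifies the record
            matches.append((names[0], rec["diagnosis"]))
    return matches


reidentified = link(health_records, public_records)
print(reidentified)
```

In this toy data two of the three "anonymized" records are linked back to a named person, even though no name or identification number was ever stored with the health data.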

Legal,  Moral,  and  Ethical  Considerations

In  the  health  domain  big  data  analytics  exacerbates  the  antinomy 
between  healthcare  as  a human  right  and  healthcare  as  a commodity.  The 
UN  Human  Rights  Charter  and  the  Convention  on  the  Elimination  of  All 
Forms  of  Discrimination  against  Women  formulate  a concept  of  human 
rights  that  includes  rights  essential  to  human  development,  such  as  rights  to 
adequate  housing,  healthcare,  education,  and  economic  development  (for
example,  employment  at  a fair  wage),  that  apply  to  all  humanity,  regardless 
of  gender,  age,  race,  ethnicity,  sexual  orientation,  etc.  (UN  2002). 
Nussbaum  (2000;  see  also  Sen  1982)  observes  that  bodily  health  is  second 
only  to  life,  supported  by  bodily  integrity,  as  essential  capabilities  necessary 
to  flourish  as  individual  human  beings.  Sen  (1999)  posits  five  types  of 
instrumental  freedoms  as  essential  to  human  freedom  in  the  polity:  political 
freedoms,  including  free  speech  and  free  elections,  to  help  promote 
economic  security;  economic  facilities,  in  the  form  of  opportunities  to 
engage  in  market  activities  and  production;  social  opportunities,  among 
them  access  to  education  and  health  care;  transparency  guarantees,  those 
mechanisms  and  institutions  necessary  to  guarantee  full  disclosure  of 
information  - the  basis  for  trust;  and  protective  security,  those  institutions 
necessary  to  prevent  any  human  from  sinking  into  destitution  and  abject 
poverty.  Although  there  are  moral  strictures  against  using  big  data  analytics 
to  restrict  insurance  coverage  to  individuals,  such  practices  occur.  Similarly, 
in  a  “market  model”  of  healthcare  access  and  delivery,  there  are  very  few
means,  other  than  moral  opprobrium,  to  restrain  a  pharmaceutical
company  from  raising  the  price  of  a  drug  by,  for  example,  2,000%  or  5,500%
(CBS  News  September  22,  2015;  NBC  News  September  22,  2015). 

Big  data  analytics  also  faces  other  normative  barriers.  For  example, 
patients  are  likely  to  accept  the  necessity  of  data  sharing  among  providers 
to  improve  the  quality  of  care,  but  the  notion  that  their  data  will  be  shared 
with  other  non-provider  third  parties  has  proven  to  be  controversial, 
especially  when  high-profile  cases  emerge  in  which  data  were  shared  without
the  owner’s  consent.  The  case  of  Henrietta  Lacks  comes  to  mind.  She  was  a
young  African  American  woman  who,  in  1951,  died  of  cervical  cancer. 
Doctors  took  samples  of  her  cells  without  her  knowledge  and  shared  them 
with  other  clinicians  and  researchers.  Although  labs  were  selling  samples
of  what  came  to  be  known  as  the  HeLa  cells,  Lacks’s  family  received  no 
portion  of  the  money  generated  by  those  sales  and  were  not  informed  how 
these  cells  were  used  (Falik  2014).  Misuse  of  big  data  analytics  and  their
enabling  technologies  has  fostered  a  growing  wariness  among  citizens
toward  their  government.  The  majority  of  citizens  understand  that  their
data  need  to  be  shared  to  support  the  common  weal  but,  as  the
National  Security  Agency’s  spying  on  the  U.S.  population  indicates,  the
citizenry  is  justified  in  its  suspicions  of  its  government.

Big  Data  Analytics  Workforce  Challenges 

To  be  effective,  big  data  analytics  require  a non-insular,  non- 
compartmentalized  ontological  perspective  and  a multi-disciplinary, 
holistic  approach  to  knowledge  acquisition  that  incorporates  skills  from  a 
variety  of  academic  disciplines,  including,  for  example,  quantitative 
analysis  (statisticians,  computer  scientists),  finance  (financial  analysts,  cost 
analysts,  fraud  analysts),  healthcare  (medical  practitioners,  biologists, 
chemists,  product  engineers),  infrastructure  and  device  engineering 
(communications  and  device  engineers),  social  sciences  (sociologists, 
medical  healthcare  economists),  governance  (policy  analysts),  information 
management  (information  governance  experts  and  managers,  librarians), 
deontology  (ethicists),  and  jurisprudence  (legal  professionals).  The  majority  of
universities  and  training  institutes  do  not  offer  cross-field  programs  that 
emphasize  the  integration  of  these  skills.  As  a result,  one  of  the  roles 
frequently  overlooked  in  efforts  to  minimize  the  risks  of  the  misuse  of  big 
data  analytics  is  the  role  of  the  ethicist.  If  a little  data  analytics  can  lead  to 
misuse,  big  data  analytics  can  lead  to  even  greater  misuse  because  so  many 
more  data  are  available  for  abusive  practices. 

Future  Trends 

The  study  indicates  a number  of  trends.  Cloud-based  big  data 
analytics  and  usage  will  grow.  Big  data  analytics  and  data  interoperability 
have  introduced  increased  concerns  for  effective  privacy  and  security 
management,  defense  against  data  breaches,  and  data  storage  management. 
To  address  these  concerns,  government  participation  in  developing, 
promulgating,  and  enforcing  standards  will  increase  (for  example,  NIST 
Standards  for  cloud-based  security).  Business  intelligence  (BI)  vendors  will 
expand  their  offerings  to  accommodate  very  large  data  sets  and  big  data
analytics.  Legal  and  human  rights  debates  will  become  more  contentious, 
especially  about  topics  such  as  data  ownership  and  stewardship  as  they 
relate  to  data  sharing  and  individual  privacy.  In  the  health  domain  EHR 
adoption  will  continue  sporadically  and  in  geographic  isolation,  especially 
by  the  less-funded  practices.  Because  of  the  extensive  knowledge  base 
required  to  perform  big  data  analytics  effectively  and  ethically,  big  data 
analytics  will  become  increasingly  the  domain  of  intellectual  and  educated 
elites. 

Endnotes


1  See,  for  example,  periodic  issues  dedicated  to  the  use  of  big  data  published  in  such
trade  journals  as  Federal  Computer  Week  or  Healthcare  Informatics.  The  frequency  of
webinars  dedicated  to  big  data  offered  by  The  Data  Warehousing  Institute  (TDWI)  may
also  prove  instructive.

2  With  the  advent  of  big  data  and  the  increasing  use  of  HTML-based  EHRs  (with  the  HL7
Consolidated  Clinical  Document  Architecture  (CCDA)),  it  is  possible  to  embed  genomic  data  in  a  patient’s  EHR,
offering  the  possibility  of  developing  individualized  medical  protocols  for  patients. 

Big  Data:  Who’s  Accountable?

Jong-on  Hahm 

George  Washington  University 

Abstract 

Data  analysis,  big  or  small,  requires  careful  handling  of  data  to  guard  against
sorting  bias  and  errant  categorization  that  can  lead  to  inaccurate  conclusions.
Sorting  error  may  be  introduced  when  attempting  to  harmonize  existing  datasets
with  new  datasets  that  offer  many  more  parameters.  Caution  in  data 
categorization,  searching  for  specific  factors,  and  drawing  conclusions,  is 
paramount  for  policymakers  looking  to  use  Big  Data  for  societal  benefits. 

A  state  decides  to  set  aside  economic  development  zones  and  wants  to
encourage  minority  residents  to  establish  businesses  and  hire  workers.  To 
determine  target  areas  for  public  outreach,  data  are  scraped  from  publicly 
available  sources,  cleaned,  analyzed,  and  mapped  to  identify  communities 
where  resources  should  be  expended.  When  program  administrators  pull  up 
the  first  map,  they  are  surprised  to  discover  complete  blanks  in  regions 
where  significant  populations  of  minorities  are  known  to  reside. 

A retailer  planning  an  expansion  into  a new  region  pursues  a 
marketing  scheme  intended  to  identify  specific  populations.  The  first 
advertising  blitz  includes  a direct  mail  campaign  where  residents  are  offered 
free  trials  and  samples  of  products.  After  the  first  mailers  are  sent,  the 
retailer  is  bombarded  with  negative  feedback  and  complaints  about 
inappropriate  and  offensive  product  information. 

A consulting  firm  hired  to  improve  efficiency  and  service  in  a 
surgical  unit  at  a hospital  talks  to  every  staff  member  of  the  surgical  unit.  It 
devises  an  electronic  tracking  system  to  harmonize  scheduling,  smooth 
patient  transfer  protocols  and  keep  the  unit  at  high  functional  capacity. 
Within  the  first  week,  the  schedule  has  bogged  down  completely  and 
patients  have  had  to  be  referred  to  nearby  hospitals. 

In  all  of  the  above  examples,  something  in  the  use  of  data  has  led  to 
unintended,  sometimes  risky  outcomes.  As  has  become  clear,  the  use  of  Big 
Data  presents  its  own  set  of  methodological  and  analytical  challenges.  The 
question  then  arises:  when  Big  Data  goes  wrong,  who’s  accountable? 

The  advent  of  Big  Data  has  seduced  the  unwary  into  the  promise  of
the  possible,  overlooking  the  promise  of  accompanying  problems.  In 
enormous  data  sets,  very  slight  differences  in  sorting  and  categorization  can 
lead  to  large  differences,  particularly  as  data  are  collected  into  ever-larger 
sets.  In  a long-term  study,  when  data  increase  by  orders  of  magnitude  over 
time,  minute  differences  could  grow  into  significance  and  result  in  greatly 
disparate  impacts. 

When  examining  accountability,  the  goal  should  not  be  to  conduct  a 
forensic  analysis  of  where,  what,  and  how  things  went  wrong,  but  rather  to 
establish  parameters  from  the  outset  that  prevent  such  errors  in  the  first 
place.  In  Big  Data  analyses,  two  critical  issues  must  be  considered  in  the
research  design:  sorting  bias  and  harmonization  of  old  and  new  datasets. 

Oftentimes,  data  collection  is  driven  by  the  tools  available.  For 
example,  if  a research  project  will  use  a certain  analysis  program  or 
approach,  data  will  likely  be  collected  in  a format  most  amenable  for  use 
with  that  program.  If  the  data  become  unwieldy  or  the  desired  granularity  is 
different,  sorting  characteristics  may  be  changed  to  make  the  data  work  better  with
the  analytical  tool.  Sometimes,  data  may  be  sorted  according  to  a 
researcher’s  own  categorization  algorithm  without  any  conscious 
realization  of  such  sorting. 

In  2013,  the  Wikipedia  community  noticed  that  its  list  of  “American 
novelists”  no  longer  contained  any  women.  They  had  all  been  placed  on  a 
separate  list  of  “American  women  novelists”  (Filipacchi  2013).  At  the  time, 
Wikipedia  wanted  to  make  the  list  for  “American  novelists”  less  unwieldy, 
and  began  creating  subcategories  (Neary  2013).  This  separation  unleashed 
a firestorm  of  criticism  from  those  who  perceived  the  categorization  as 
sexism,  as  there  was  no  subcategory  for  “American  men  novelists” 
(McDonough  2013;  Flood  2013). 

Whereas  the  constraints  of  Wikipedia’s  platform  may  have  led  to 
this  reclassification,  the  bias  in  characterizing  standard  “American 
novelists”  to  be  male  illustrates  the  potential  for  sorting  bias  in  handling 
data  even  before  analysis  is  attempted.  Data  can  be  altered  through 
stratification,  separation,  or  combination.  Data  cleaning  can  alter  datasets 
such  that  nuances  of  the  raw  data,  small  shifts,  offsets  or  trends  that  may 
indicate  the  influence  of  unanticipated  factors  are  lost.

The  negative  impact  of  bias  in  data  analysis  has  been
noted,  sometimes  in  spectacular  fashion.  Photo  auto-tagging  programs 
developed  by  Yahoo  and  Google  led  to  African-Americans  being  tagged 
with  terms  like  “gorilla”  and  “ape”  (Dwoskin  2015).  Particularly  in 
software  that  uses  machine  learning  algorithms,  a biasing  factor  such  as 
selective  data  sets  used  for  training  can  push  the  learning  in  an  unintended 
direction. 

Unfortunately,  avoiding  bias  is  proving  to  be  a much  more  daunting 
task  than  previously  conceived.  According  to  Valerio  Pascucci,  a leading 
researcher  (XSEDE15,  July  28,  2015),  simply  looking  for  something  in  a 
large  data  set  will  inject  bias  into  the  data  analysis  (Gibson  2015). 

Ideally,  such  unintended  directional  tacks  will  not  occur  in  most 
research  projects.  Data  will  be  derived,  for  the  most  part,  from  known  data 
sets,  analysis  will  use  well-established,  commonly  used  methods,  and  data
comparability  and  compatibility  will  not  be  an  issue.  The  challenge  arises
when  new  data  sets,  offering  richer,  more  informative  views  of  a subject 
population,  become  available,  and  researchers  want  to  incorporate  the  new
information  with  the  established  set.  At  this  point,  the  difficulty  of  data 
harmonization,  in  even  the  most  basic  ways,  becomes  evident. 

As  computational  capacity  increases,  an  obvious  difficulty  arises  in 
matching  categories  end  to  end.  The  US  Census  in  2020  will  include  an 
expanded  range  of  choices  for  race  and  ethnicity  to  reflect  the  growing 
percentage  of  Americans  who  identify  as  multiracial  (Pew  Research  Center 
2015). 

In  the  example  of  the  state  wanting  to  spur  minority 
entrepreneurship  and  workforce  development,  the  state  government  may 
have  wanted  to  harmonize  existing  data  with  newly  available  data  on  its 
minority  populations.  The  state  may  have  had  to  rely  on  existing  district 
maps  with  much  more  limited  population  information.  Constrained  by 
budget  restrictions  and  the  tools  available,  harmonization  of  data  sets  may 
not  have  been  as  tailored  as  could  be,  leading  to  incorrect  targeting  of 
desired  populations.  The  retailer  may  have  introduced  sorting  bias  that 
conflated  demographics  with  product  interest. 
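
One way such blanks could arise is sketched below, under stated assumptions: the legacy category list, the survey responses, and both harmonization functions are hypothetical and are not taken from any actual state system. A strict mapping silently drops every response that does not fit the old schema, while a more inclusive mapping keeps those residents visible.

```python
from collections import Counter

# Hypothetical legacy schema: the only categories the existing district maps recognize.
LEGACY_CATEGORIES = {"White", "Black", "Hispanic", "Asian"}

# Hypothetical responses from a newer dataset that allows multiple selections.
new_responses = [
    "Black", "Hispanic", "Black; Hispanic", "Asian; White",
    "White", "Black; Asian", "Hispanic", "Asian",
]


def harmonize_strict(response):
    """Keep a response only if it matches a legacy category exactly; otherwise drop it."""
    return response if response in LEGACY_CATEGORIES else None


def harmonize_inclusive(response):
    """Map multi-selection responses to 'Multiracial' and flag anything unrecognized."""
    parts = [part.strip() for part in response.split(";")]
    if not all(part in LEGACY_CATEGORIES for part in parts):
        return "Unmapped"
    return parts[0] if len(parts) == 1 else "Multiracial"


strict = Counter(harmonize_strict(r) for r in new_responses)
inclusive = Counter(harmonize_inclusive(r) for r in new_responses)
dropped = strict[None]  # residents who vanish from the map under the strict mapping

print("dropped by strict mapping:", dropped)
print("inclusive counts:", dict(inclusive))
```

Neither mapping is offered as the correct one. The point is that the choice is made before any analysis begins, and under the strict mapping three of eight residents never reach the map.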

The  third  example  is  based  on  an  actual  case  and  illustrates  how 
sorting  bias  can  creep  into  even  Small  Data  analyses.  The  consultant  hired 
to  help  improve  surgery  scheduling  developed  an  electronic  scheduling
system  to  replace  a whiteboard  on  which  surgery  schedules  were  manually 
written  with  markers.  The  system  degenerated  because  the  consultants  did 
not  take  into  consideration  the  key  role  played  by  technicians  who 
maintained  supply  stocks  in  the  surgical  suites.  The  technicians  were  not 
interviewed  in  the  original  data  collection,  nor  were  they  provided  access  to 
the  new  electronic  scheduling  system.  Hence,  they  did  not  stock  the  suites 
appropriately  for  each  procedure.  After  a week  of  confusion  and  referral  of 
patients  to  nearby  hospitals  (with  concomitant  loss  of  revenue),  surgical 
scheduling  reverted  to  the  whiteboard  system. 

The  very  nature  of  Big  Data  underscores  the  potential  impact  of 
infinitesimal  differences  that  can  become  magnified  in  the  petabytes  of  data 
being  generated  and  mined  for  meaningful  information.  While  analytical 
models  can  be  changed  with  little  imprint,  impact  on  society  and 
communities  lingers  on. 

Abstract

The  availability  and  usability  of  massive  data  sets  have  added  to  the  increasing 
popularity  of  big  data  research.  However,  common  mechanisms  of  big  data 
collection  (e.g.,  social  media,  open  source  platforms,  and  other  online  user  data)
can  be  problematic.  Sampling  issues,  especially  selection  bias,  associated  with 
these  data  sources  can  have  far  reaching  implications  for  analysis  and 
interpretation.  This  paper  examines  the  types  of  sampling  issues  that  arise  in  big 
data  projects,  how  and  why  biases  occur,  and  their  implications.  It  concludes  by 
providing  strategies  for  dealing  with  sampling  and  selection  bias  in  big  data 
projects. 


From  the  dawn  of  humankind  to  the  year  2003,  it  is  estimated  that  5 
exabytes  (10^18  bytes)  of  data  were  created  by  humans.  Today,  humans  create
about  2.5  exabytes  of  data  every  day  (Intel  IT  Center  2012;  Sagiroglu  and 
Sinanc  2013).  This  explosion  of  information,  due  in  large  part  to 
developments  in  data  mining  and  collection,  data  warehousing  and  storage, 
and  computational  capacity  and  performance,  has  led  to  “the  era  of  big 
data.”  Individual-level  data  can  now  be  collected  and  mined  using  online 
platforms,  social  media,  and  cell  phone  applications,  giving  big  data 
researchers  increasing  levels  of  insight  into  previously  unobserved 
behavioral  patterns  and  other  “found  data”  (Harford  2014;  Tufekci  2014; 
Fan  et  al.  2014;  Sagiroglu  and  Sinanc  2013;  Yang  and  Wu  2013;  De  Mauro 
et  al.  2014). 

With  this  influx  of  data  researchers  have  been  able  to  make  significant 
strides  in  fields  such  as  health  care,  finance  and  economics,  and  social 
science;  however,  big  data  research  is  not  a panacea  for  data  analytics. 
Though  some  data  scientists  subscribe  to  the  “myth  of  large  n”  - i.e.,  when 
data  are  “big”  enough  biases  are  not  significant  -  statistical  errors  and  biases
can  still  impact  research  findings  regardless  of  the  size  of  the  dataset 
(Harford  2014;  Anderson  2008;  Lazer  2014).  Due  to  the  nature  of  data 
collection  and  mining,  and  the  methods  used  therein,  big  data  research  may 
be  particularly  susceptible  to  sampling  biases  (Tufekci  2014;  Fan  et  al. 
2014;  Harford  2014). 
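
A small simulation makes the point concrete. It is only a sketch: the two groups, their support rates, and the share of each group among platform users are invented numbers, chosen so that the "found data" over-represent one group.

```python
import random

random.seed(0)

# Hypothetical population: group A is 30% of people and 60% of its members support a measure;
# group B is 70% of people and 40% of its members support it.
SUPPORT_RATE = {"A": 0.6, "B": 0.4}
POPULATION_SHARE_A = 0.3
TRUE_SUPPORT = (POPULATION_SHARE_A * SUPPORT_RATE["A"]
                + (1 - POPULATION_SHARE_A) * SUPPORT_RATE["B"])


def estimate_support(n, share_a):
    """Estimate support from n people drawn from a source where group A has share `share_a`."""
    supporters = 0
    for _ in range(n):
        group = "A" if random.random() < share_a else "B"
        if random.random() < SUPPORT_RATE[group]:
            supporters += 1
    return supporters / n


results = {}
for n in (1_000, 100_000):
    results[n] = {
        # Representative sample: group shares match the population.
        "random_sample": estimate_support(n, share_a=POPULATION_SHARE_A),
        # Found data: the platform over-represents group A (assumed 80% of users).
        "found_data": estimate_support(n, share_a=0.8),
    }
    print(n, results[n])
```

As n grows, the random sample settles near the true value of 0.46, while the found-data estimate settles near 0.56. More data narrows the noise around the wrong answer but does not move it toward the right one.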

This  paper  will  explore  the  dangers  in  the  “myth  of  large  n”  by
examining  issues  related  to  selection  bias  in  big  data  research  in  particular, 
and  will  attempt  to  assess  the  extent  to  which  big  data  projects  may  be 
affected  by  selection  bias,  the  implications  of  this  bias  for  research,  and 
potential  strategies  for  accounting  for  such  limitations.  We  choose  to  focus 
on  selection  bias  due  to  both  the  popularity  of  found  data  and  mined  social 
media  data  in  big  data  analyses  and  the  likelihood  that  these  sources  produce 
non-random  samples. 

Big  data  are  characterized  by  the  “3  V’s”:  volume  (amount  and  size),
velocity  (real  time  or  batch),  and  variety  (structured  or  unstructured)
(Sagiroglu  and  Sinanc  2013;  De  Mauro  et  al.  2014).  The  source  and  utility
of  big  data  can  take  many  forms.  Companies  like  Wal-Mart  and  Target 
collect  and  analyze  real-time  purchase  data  to  predict  consumer  preferences, 
while  economic  researchers  collect  cell  phone  location  data  to  determine 
the  distance  consumers  are  willing  to  travel  to  a shopping  mall,  a proxy  for 
consumer  demand  and  economic  strength  (Lazer  2014;  Bollier  2010). 
Health  and  life  science  researchers  have  harnessed  big  data  and  increased 
computing  power  to  revitalize  genomic  sequencing,  compressing  what  had 
been  a  ten-year  process  to  less  than  a  week  (Fan  et  al.  2014;  Harford  2014).

Covering  all  “3  V’s”,  data  mined  from  social  media,  cell  phone 
applications,  and  open  source  online  platforms  provide  big  data  researchers 
with  unique  insight  into  human  behavior  and  interactions  by  providing 
large,  real  time  data  on  their  users  and  their  content  (Bollier  2010).  With 
such  large  n’s,  big  data  research  offers  interesting  new  ways  to  conduct 
analyses.  In  lieu  of  the  traditional  method  of  formulating  hypotheses  and 
theory  before  analyzing  the  data,  big  data  researchers  often  take  a high-level 
look  at  massive  data  sets,  noting  interesting  or  unexpected  correlations,  and 
then  forming  hypotheses  and  theories  around  those  correlations  (Lohr  2012; 
Harford  2014;  Anderson  2008;  Bollier  2010).  This  exploratory  approach 
has  led  some  to  term  the  era  of  big  data  as  the  “end  of  theory”  (Anderson 
2008;  Bollier  2010),  suggesting  that  deductive  reasoning  grounded  in 
previous  research  literature  is  no  longer  necessary  with  such  large,  timely, 
and  varied  data. 

By  pursuing  these  exploratory  approaches,  and  making  claims  based  on 
discovered  correlations,  researchers  risk  falling  prey  to  the  traditional 
limitations  and  biases  inherent  in  both  statistical  and  social  research.  When 
researchers  subscribe  to  the  “myth  of  large  n”,  or  the  “n  =  all”  suppositions,
certain  faults  of  big  data  may  be  ignored  (Lazer  2014;  Bollier  2010;  Harford
2014).  In  particular,  big  data  are  especially  susceptible  to  endogeneity,
autocorrelation  of  errors,  spurious  correlations,  and  selection  bias  (Fan  et  al.
2014;  Lazer  2014;  Tufekci  2014;  Harford  2014). 

The  emphasis  on  social  media  data  mining  and  other  data  collection 
from  open  source  platforms  and  applications  increases  big  data 
vulnerability  to  selection  bias  in  particular  (Fan  et  al.  2014).  Selection  bias 
describes  the  bias  that  is  present  when  the  selection  of  a sample  or  study 
group  is  such  that  proper  randomization  is  not  achieved  and  the  sample  is, 
therefore,  not  representative  of  the  larger  population  (Berk  1983;  Heckman 
1979).  Figure  1 shows  a classic  image  that  resulted  from  selection  bias.  The 
Chicago  Tribune,  which  relied  heavily  on  telephone  surveys  for  its
election  predictions,  prematurely  claimed  Dewey  as  the  winner  of  the  1948 
presidential  election.  The  high  cost  of  telephone  lines  led  to  biased  results 
as  affluent  Americans  were  more  apt  to  support  Dewey  than  those  who  were 
less  affluent.  Selection  bias  is  particularly  problematic  when  relying  on 
exploratory  research  and  analyzing  correlations,  since  self-selected  samples 
often  exhibit  different  correlational  tendencies  than  random  samples.  Most
important  for  big  data  analysis  is  the  presence  of  confounding  variables:
persons  who  self-select  into  certain  groups  often  have  other  characteristics
in  common,  such  as  demographic  factors  or  similar  environments,  that
researchers  do  not  account  for  (Tufekci  2014;  Fan  et  al.  2014).

More  simply,  selection  bias  describes  the  likelihood  that  certain  persons 
or  groups  are  more  apt  to  be  picked  up  by  big  data  collection  efforts  than 
others,  whether  due  to  their  use  of  social  media  and  open  source  platforms, 
the  availability  of  internet  connectivity  in  certain  areas,  their  ability  to 
purchase  smart  phones  and  access  applications,  or  any  other  number  of 
omitted  variables.  For  example,  StreetBump,  a smart  phone  based 
application  rolled  out  in  Boston,  sought  to  record  potholes  while  users  were 
driving  and  report  these  potholes  to  the  city  for  repair.  Eventually,  when
potholes  were  being  disproportionately  reported  in  affluent  neighborhoods, 
a deeper  analysis  revealed  an  issue  with  selection  bias.  Residents  in  affluent 
neighborhoods  were  far  more  likely  to  own  smart  phones  with  network 
access  (and  cars)  than  residents  who  lived  in  lower  income
neighborhoods  (Harford  2014). 
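
The pattern can be reproduced with a toy model. The figures below are assumptions made for illustration and are not StreetBump data: both neighborhoods have the same number of potholes, but the chance that any given pothole is driven over by someone running the application differs.

```python
import random

random.seed(1)

# Hypothetical neighborhoods with identical road conditions but different
# rates of smart phone (and car) ownership, expressed as a per-pothole reporting rate.
neighborhoods = {
    "affluent": {"potholes": 200, "reporting_rate": 0.60},
    "lower_income": {"potholes": 200, "reporting_rate": 0.20},
}


def reported(potholes, reporting_rate):
    """Count the potholes that are detected by at least one application user."""
    return sum(random.random() < reporting_rate for _ in range(potholes))


reports = {name: reported(**attrs) for name, attrs in neighborhoods.items()}

# If the reporting rate were known, dividing by it would recover a rough estimate
# of the true count; in practice that rate is exactly what is unobserved.
adjusted = {
    name: reports[name] / neighborhoods[name]["reporting_rate"]
    for name in neighborhoods
}

print("raw reports:", reports)
print("adjusted estimates:", {k: round(v) for k, v in adjusted.items()})
```

Read naively, the raw counts suggest that the affluent neighborhood has far worse roads, although the underlying pothole counts in this sketch are equal by construction.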


Figure  1.  Classic  Case  of  Selection  Bias:  Chicago  Tribune  Declares  “Dewey  Defeats 
Truman”  (photo  credit:  Associated  Press). 

These  demographic  findings  are  consistent  with  2014  Pew  Research 
Center  survey  findings  on  smart  phone  users.  Pew’s  survey  estimated  that 
about  65  percent  of  American  adults  are  smart  phone  users,  but  that 
population  is,  on  average,  under  40  years  old,  college-educated,  and  affluent 
(i.e.,  average  incomes  of  more  than  $75,000  per  year),  and  far  more  likely 
to  be  employed  than  non-smart  phone  users  (Pew  Research  Center  2015). 
Given  the  demographics  of  smart  phone  users,  it  is  problematic  to  use  smart
phone  data  to  make  general  claims  about  the  U.S.  population.  Additionally,
these  problems  may  be  compounded  if  researchers  take  exploratory 
approaches  and  simply  analyze  the  data  for  interesting  correlations,  as 
additional  context  is  usually  needed  to  determine  what  variables  are  driving 
the  correlation  and  what  factors  may  cause  the  correlation  to  break  down 
(Harford  2014). 

Selection  bias  has  serious  implications  when  left  uncontrolled  in  a 
standard  linear  model  as  it  creates  a non-linear  relationship  between  the 
dependent  and  key  independent  variable,  such  that  a causal  relationship  may 
be  misinterpreted  and  effects  resulting  from  random  noise  in  the  model  are 
mistaken  for  causal  effects,  affecting  both  internal  and  external  validity. 
External  validity  is  undermined  when  selection  bias  is  present  by
underestimating  the  slope  of  the  regression  line,  often  leading  to  causal 
effects  being  underestimated.  Internal  validity  may  be  similarly 
compromised  if  the  effect  of  the  “exogenous  variable  and  the  disturbance 
term  are  confounded”,  leading  to  causal  effects  of  an  independent  variable 
being  confused  with  random  noise  in  the  data  (Berk  1983,  p.  388).  By 
failing  to  formulate  a theory  and  corresponding  model,  researchers  are  often 
unable  to  control  for  significant  selection  biases  and  jeopardize  the  validity 
of  their  research  (Berk  1983;  Heckman  1979). 
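
The attenuation of the regression slope can be seen in a short simulation. This is a sketch under simple assumptions (a true slope of 1.0, standard normal noise, and a sample that retains only cases whose outcome exceeds zero); it is not a reproduction of Berk's or Heckman's analyses, and the helper `ols_slope` is written here only for illustration.

```python
import random

random.seed(42)


def ols_slope(xs, ys):
    """Ordinary least squares slope of y on x for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


TRUE_SLOPE = 1.0
xs = [random.gauss(0, 1) for _ in range(20_000)]
ys = [TRUE_SLOPE * x + random.gauss(0, 1) for x in xs]

# Slope estimated from the full, randomly generated sample.
full_slope = ols_slope(xs, ys)

# Non-random selection: a case is observed only if its outcome exceeds a threshold.
selected = [(x, y) for x, y in zip(xs, ys) if y > 0]
sel_x, sel_y = zip(*selected)
selected_slope = ols_slope(sel_x, sel_y)

print("slope, full sample:    ", round(full_slope, 2))
print("slope, selected sample:", round(selected_slope, 2))
```

The full sample recovers a slope close to 1.0, while the selected sample yields a markedly flatter line, so the effect of the independent variable is underestimated even though roughly ten thousand cases remain.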

Other  big  data  mining  and  collection  efforts  that  have  assumed  “large  n
=  no  bias”  analyze  social  media  user  data.  For  example,  Twitter-generated
data  has  become  quite  popular  among  big  data  researchers  because  of  the 
ease  of  data  collection  along  with  its  connection  to  user  data,  such  as  tweet 
content,  retweets,  and  engagement  in  trending  topics.  However,  only  23  percent  of
U.S.  adults  already  online  use  Twitter,  and  among  that  23  percent,  the 
population  is  largely  minority  (African-American  and  Hispanic)  youth,  with 
about  37%  of  Twitter  users  under  the  age  of  30  (Duggan  et  al.  2015).  Big 
data  researchers  often  use  Twitter  to  gauge  public  opinion  on  hot  issues  or 
learn  more  about  consumer  preferences;  however,  these  data  are 
problematic  because  Twitter  users  are  not  comparable  to  the  U.S. 
population  as  a whole  (Tufekci  2014). 

Using  sources  such  as  Twitter  presents  other  unique  sampling  and 
selection  issues.  A common  avenue  of  big  data  analysis  using  Twitter  is 
“hashtag  analysis”;  that  is,  using  Twitter’s  linking  system  (tagging  a post 
with  a “#”  connects  that  post  with  a live  feed  of  all  users  tweeting  about  that 
particular  topic)  to  gauge  public  opinion  on  timely,  hot  button  issues. 
Research  by  Tufekci  (2014)  highlights  the  inherent  problem  of  selecting 
cases  based  on  the  dependent  variable.  That  is,  users  are  only  able  to  be 
included  in  the  sample  if  they  have  tagged  their  tweet  appropriately.  This 
issue  causes  researchers  to  overlook  cases  where  the  user  has  not  linked  the 
post  but  is  still  engaged  in  the  larger  conversation,  and  is  necessarily  subject 
to  self-selection  bias,  as  the  user  has  made  a conscious  choice  to  tag  their 
post  and  include  their  content  in  the  larger  discussion  (Tufekci  2014; 
Geddes  1990).  Additionally,  those  hashtags  that  are  used  for  analysis  are 
those  that  were  successful  (i.e.,  generated  a  large  base  of  users  engaging  in
the  conversation),  and  are  therefore  different  from  those  hashtags  that  were
unsuccessful  in  generating  conversation. 
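
A small simulation can make the hashtag self-selection problem concrete. The sketch below is illustrative only and is not drawn from Tufekci (2014): it assumes a hypothetical population whose opinions are spread uniformly between -1 and +1, and it assumes that the probability of tagging a tweet rises with support for the issue (the 5 to 30 percent tagging range is invented for the illustration).

```python
import random


def simulate_hashtag_sample(n_users=100_000, seed=42):
    """Compare the population mean opinion with the mean among taggers.

    All quantities are hypothetical; the tagging rule is an assumption
    chosen only to illustrate self-selection.
    """
    rng = random.Random(seed)
    all_opinions = []
    tagged_opinions = []
    for _ in range(n_users):
        # -1 = strongly opposed, +1 = strongly in favor
        opinion = rng.uniform(-1.0, 1.0)
        # Assumed tagging probability: 5% for opponents up to 30% for supporters
        p_tag = 0.05 + 0.25 * (opinion + 1.0) / 2.0
        all_opinions.append(opinion)
        if rng.random() < p_tag:
            tagged_opinions.append(opinion)
    true_mean = sum(all_opinions) / len(all_opinions)
    tagged_mean = sum(tagged_opinions) / len(tagged_opinions)
    return true_mean, tagged_mean, len(tagged_opinions)


if __name__ == "__main__":
    true_mean, tagged_mean, n_tagged = simulate_hashtag_sample()
    print(f"population mean opinion: {true_mean:+.3f}")
    print(f"hashtag-sample mean:     {tagged_mean:+.3f} (n = {n_tagged})")
```

Under these assumptions the population mean sits near zero while the mean among tagged tweets is clearly positive, so an analysis limited to the hashtag would overstate support for the issue.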

There  are  several  ways  researchers  may  account  for  selection  bias  in  big 
data  analysis.  Traditionally,  selection  biases  are  controlled  for  through 
statistical  modeling.  Researchers  should  examine  the  overall  demographics 
of  their  data  source  and  attempt  to  control  for  confounding  variables. 
Certain  models  and  statistical  methods  are  also  better  equipped  to  handle 
such  biases,  such  as  a regression  discontinuity  design  (Taylor  2014), 
difference-in-difference  models  with  a matching  model  as  suggested  by 
Heckman  (1979),  and  a non-linear  Tobit  model,  used  by  both  Heckman 
(1979)  and  Berk  (1983). 

Additionally, not all selection bias is problematic. In some cases, notably
market  research,  selection  bias  enables  greater  targeting  of  advertising.  For 
example,  retailers  use  “just-in-time”  coupon  delivery  to  target  specific 
buyers.  Entertainment  services  like  Netflix  and  Amazon  use  similar 
methods  to  make  suggestions  to  viewers.  Likewise,  in  research  projects 
where motivated participants are desired, the act of participating in a
hashtag conversation may itself be noteworthy.

In  the  example  of  self-selection  in  hashtag  analysis  using  Twitter, 
researchers  can  better  account  for  the  confounding  variable  issues  present 
in  a self-selected  sample  by  going  beyond  exploratory,  correlational 
analyses.  Twitter  datasets  should  not  be  considered  random  or 
representative; rather, these data should be recognized as self-selected and
missing  data  treated  as  “missing  not  at  random”,  or  missing  due  to 
unobserved  or  unknown  variables.  In  these  analyses,  it  is  also  worthwhile 
for  researchers  to  examine  the  cultural  and  social  contexts  of  these  “trending 
topics”  and  interpret  their  findings  appropriately  (Tufekci  2014;  Meiman 
and  Freund  2012). 

Additionally,  big  data  research  projects  can  be  strengthened  by  pulling 
dependent  variables  from  external,  validated  sources.  For  example,  a study 
examining  political  attitudes  using  Twitter  or  Facebook  content  data  as 
independent  or  control  variables  could  be  strengthened  by  using  voting 
behavior  or  voting  registration  data  from  the  U.S.  Census  as  a dependent 
variable.  This  strategy,  in  particular,  is  useful  in  guarding  against  “selecting 
on  the  dependent  variable”,  as  discussed  by  Tufekci  (2014).  Alternatively, 
researchers  may  use  outside,  reliable  sources,  like  the  U.S.  Census,  to 
benchmark their findings as a sort of “gut check.” It
may  also  be  worthwhile  for  big  data  researchers  to  explore  mixed-methods 
approaches  to  their  studies.  By  complementing  big  data  analyses  with 
surveys,  interviews,  and  other  data  collection  methods,  researchers  can 
better  understand  the  larger  context  of  their  data  and  provide  more  robust, 
representative  findings. 
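
One simple form of benchmarking is to reweight a sample so that its demographic mix matches an external source. The sketch below is a minimal illustration under stated assumptions, not a method prescribed by the studies cited above: the 37 percent share of users under 30 comes from the figure reported earlier, while the group-level support rates and the benchmark shares are hypothetical placeholders rather than actual Census values.

```python
def poststratify(sample_means, sample_shares, benchmark_shares):
    """Return (unweighted_mean, reweighted_mean) for a grouped sample.

    sample_means: mean outcome observed within each group
    sample_shares: each group's share of the sample (sums to 1)
    benchmark_shares: each group's share in the external benchmark (sums to 1)
    """
    groups = list(sample_means)
    for shares in (sample_shares, benchmark_shares):
        if abs(sum(shares[g] for g in groups) - 1.0) > 1e-9:
            raise ValueError("group shares must sum to 1")
    unweighted = sum(sample_means[g] * sample_shares[g] for g in groups)
    reweighted = sum(sample_means[g] * benchmark_shares[g] for g in groups)
    return unweighted, reweighted


# Hypothetical support rates observed in a Twitter sample, by age group
SAMPLE_MEANS = {"under_30": 0.60, "30_and_over": 0.40}
# Share of the sample in each group (37% under 30, as reported in the text)
SAMPLE_SHARES = {"under_30": 0.37, "30_and_over": 0.63}
# Placeholder benchmark shares; real values would come from an external source
BENCHMARK_SHARES = {"under_30": 0.20, "30_and_over": 0.80}

if __name__ == "__main__":
    unweighted, reweighted = poststratify(
        SAMPLE_MEANS, SAMPLE_SHARES, BENCHMARK_SHARES
    )
    print(f"unweighted estimate: {unweighted:.3f}")
    print(f"reweighted estimate: {reweighted:.3f}")
```

Reweighting of this kind corrects only for the observed demographic mix. It does not remove self-selection within a group, which is one reason to pair it with the mixed-methods approaches described above.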

Big  data  research  seems  poised  to  revolutionize  data  analytics  as  we 
know  it.  By  amassing  such  large  amounts  of  data,  researchers  can  observe 
correlations  that  may  not  manifest  in  smaller  samples  and  can  analyze  large, 
near  real  time  streams  of  data  from  numerous  sources.  While  detecting 
previously  missed  correlations  can  spur  new  research  questions  and  new 
understandings  of  processes  in  fields  like  health  and  social  science,  it  is  not 
a substitute  for  established  theory,  hypothesis  development  grounded  in  the 
extant  social  science  literature,  and  statistical  modeling.  These  data,  as 
shown  through  this  paper,  may  also  be  subject  to  selection  biases,  which 
can  skew  the  findings  and  implications  of  the  research.  By  incorporating 
more  “small  data”  methods  and  techniques  into  research,  such  as  mixed- 
method  studies,  alternate  non-linear  models,  and  benchmarking,  big  data 
analysts  can  strengthen  their  studies  and  findings  and  advance  big  data 
research. 

Abstract

The  number  of  degree  programs  in  analytics  and  data  science  is  increasing 
rapidly.  Because  of  the  strong  industry  demand  for  highly  qualified  analytics 
professionals,  the  need  for  education  will  continue  to  grow.  Current  programs 
provide  strong  coverage  of  the  infrastructure  and  applications  of  analytics  and 
data  science,  but  they  are  lacking  in  the  coverage  of  their  legal,  ethical,  and 
societal  implications.  We  argue  that  every  analytics  and  data  science  program 
should  include  a significant  emphasis  on  the  implications  and  potential 
consequences  of  data  science  applications.  Including  these  elements  in  the 
programs  will  help  analytics  professionals  understand  better  the  complex  and 
nuanced  relationships  between  their  work  and  various  stakeholders  of  the 
context  in  which  the  work  takes  place.  Data  scientists  and  analysts  who  are 
sensitive  to  data  downsides  as  well  as  upsides  enable  organizations  to  avoid 
harmful  consequences  of  analytics  applications  but  still  achieve  the  benefits. 

Introduction 

Nothing is more crucial to achieving the promises of big data than
a workforce  of  capable  individuals  prepared  to  tackle  the  opportunities  and 
challenges  of  analytics.  Many  commentators  have  mentioned  projected 
shortfalls  in  the  number  of  people  qualified  to  fill  the  data  scientist  role 
(Craig  et  al.  2012,  2013;  Manyika  et  al.  2011).  A few  analysts  have  also 
pointed  to  the  need  for  technical  support  specialists  who  manage  data 
preparation,  storage,  and  related  tasks  (Woo  2013).  Yet  others  have 
discussed  deficiencies  in  the  ability  of  managers  and  subject  matter  experts 
to  sponsor,  supervise,  and  take  action  on  the  results  of  analytic  projects 
(Court  2015;  Davenport  2013). 

The  discussion  has,  however,  paid  less  attention  to  what  these 
people  need  to  know.  Through  our  participation  in  NSF-sponsored 
workshops  on  data  science  education1  and  big  data’s  social,  economic  and 
workforce  implications,2  we  have  identified  important  knowledge  gaps 
related  to  the  legal,  ethical,  and  social  implications  of  data  science.  In  this 
contribution  to  the  Symposium,  we  lay  out  the  current  state  of  data  science 
education and make the case for more attention to big data’s implications
and  consequences  in  data  science  education. 

Data  Science  Degree  Programs 

The  demand  for  professionals  with  proficiency  in  data  science  and 
analytic  techniques  has  increased  substantially  during  the  last  few  years, 
and  many  media  accounts  have  predicted  a significant  shortage  of  capable 
professionals  in  this  area.  A widely  cited  McKinsey  report  (Manyika  et  al. 
2011)  predicted  a shortfall  of  nearly  200,000  knowledge  professionals  with 
in-depth  preparation  in  analytics.  Industry  demand  for  graduates  with  this 
background  has  increased,  and  universities  around  the  world  have 
responded  by  launching  both  bachelor’s  and  master’s  programs. 

North  Carolina  State  University’s  Institute  for  Advanced  Analytics, 
an  educational  pioneer,  maintains  a database  of  related  U.S.  master’s 
programs.  In  September  2015,  this  list  included  34  programs  in  Analytics, 
19  in  Data  Science,  and  54  in  Business  Analytics.  Exact  numbers  of  students 
are  difficult  to  estimate,  but  program  heads  frequently  boast  of  their  success 
in  attracting  students  from  around  the  world.  The  largest  programs  admit 
hundreds  of  students  annually,  and  all  appear  to  bring  in  at  least  dozens.  In 
total,  data  science  and  analytics  programs  already  graduate  thousands  of 
students  per  year  in  the  U.S.  alone.  The  focus  on  data  scientist  preparation 
is  not,  however,  only  a U.S.  phenomenon;  new  programs  are  also  launching 
in  Europe  and  Asia.  We  confidently  expect  the  number  of  analytics 
graduates  to  continue  to  increase  for  some  time. 

Data  Science  Curricula 

The  disciplinary  focus  and  orientation  of  data  science  and  analytics 
programs  vary  significantly,  and  so  do  their  curricula.  University 
departments  of  statistics,  mathematics,  computer  science,  information 
systems,  management  science/operations  research,  and  information  science 
have  set  up  programs,  and  so  have  schools  and  departments  that  focus  on 
particular  sectors,  such  as  health  care,  finance,  and  the  hard  sciences.  Some 
programs  are  interdisciplinary,  crossing  several  departments  or  schools. 
Other programs (e.g., most, but not all, business analytics programs) are
housed  in  single  schools. 

Some data science and analytics programs focus primarily on the
application of data science techniques to real-world problems in science,
engineering, health care, finance, education policy, and the like. These
programs  tend  to  heavily  emphasize  data  analytic  techniques  such  as 
traditional  statistical  analysis,  data  mining,  text  mining,  time  series  analysis, 
simulation,  optimization,  and  machine  learning,  as  well  as  domain 
knowledge.  In  addition,  because  graduates  are  expected  to  work  closely 
with  subject  matter  experts,  these  programs  often  include  attention  to 
general  professional  competences  such  as  critical  thinking,  oral  and  written 
communication,  collaboration  and  teamwork,  and  consulting  skills  for 
eliciting  project  requirements.  The  applications  of  data  science  increasingly 
involve  continuous,  real-time,  algorithmic  analysis  of  large  quantities  of 
data  in  a way  that  enables  automated  organizational  decision-making. 
Therefore,  some  analytics  programs  (particularly  in  business  schools)  focus 
on  the  role  of  analytics  in  the  digital  transformation  of  organizations. 

Other  data  science  programs  focus  more  heavily  on  the  activities 
involved  in  providing  the  underlying  support  for  data  science  and  analytics 
applications;  these  programs  can  be  described  as  building  skills  in  the 
technical  infrastructure  of  data  science.  Examples  of  the  topics  emphasized 
in  these  programs  are  programming,  algorithms  and  data  structures,  data 
visualization  approaches,  data  warehousing,  data  management  for 
structured  and  unstructured  data,  and  so  forth. 

A Missing  Emphasis 

In  addition  to  applications  and  infrastructure,  a third  key  body  of 
knowledge  relevant  to  data  science  and  analytics  concerns  their  legal, 
ethical, and societal implications (see Table 1, which illustrates the three
bodies of data science knowledge, based on Markus and Topi 2015). One
can  hardly  mention  the  topic  of  big  data  without  evoking  privacy  concerns, 
and  discussion  of  security  issues  often  follows  closely  behind.  Also  relevant 
are  concerns  about  illegal  discrimination,  behavioral  manipulation, 
harassment,  and  inappropriate  “social  sorting”  (or  labeling  people  via 
identity  profiles)  (Markus  and  Topi  2015).  These  legal,  ethical,  and  societal 
implications  of  big  data  are  rarely  given  attention  proportionate  to  their 
importance in data science and analytics programs or educational materials
(Provost  and  Fawcett  2013).  Instead,  such  topics  are  typically  covered  in 
educational  programs  of  law,  accounting  information  systems,  social 
sciences, and public policy, where data scientists-in-training may not be
exposed  to  them. 

Table  1:  Three  Bodies  of  Data  Science  Knowledge 
(adapted  from  Markus  and  Topi,  2015) 

Applications 

• Use  of  data  science  tools  and  techniques  to  generate  new  insights  in 
domains  such  as  marketing,  health  care,  law,  finance,  science 

• Use  of  data  science  tools  and  techniques  to  fully  or  partially 
automate  previously  manual  decision-making  processes  such  as  the 
auctioning  of  advertising,  mortgage  or  insurance  underwriting, 
medical  diagnosis,  e-discovery,  securities  trading,  identification  of 
promising  drug  molecules 

• Development  of  new  “apps”  and  data-oriented  business  processes 
and  business  models 

Infrastructure 

• Development  of  new  tools  and  techniques  for  data  handling  (e.g., 
extracting,  transferring  and  loading  data,  data  storage,  tagging  and 
curating  data,  cleaning  and  verifying  data) 

• Development  of  new  software  and  hardware  tools  and  techniques 
for  data  analysis  and  interpretation  (e.g.,  text  mining,  data 
visualization,  machine  learning) 

Implications 

• Laws  and  regulations  governing  data  protection,  data  security,  and 
data  management  requirements  (e.g.,  document  retention  and 
destruction) 

• Design  of  organizational  structures  and  governance  mechanisms 
that  promote  responsible  (legal,  ethical,  and  socially  acceptable) 
data  collection  and  use  practices 

• Potential positive and negative consequences of big data
applications,  tools  for  anticipating  them,  and  strategies  and 
techniques  for  minimizing  negative  side-effects. 

Most data science and analytics degree programs appear to take,
either implicitly or explicitly, a strong pro-innovation stance, implying
that  the  consequences  of  big  data  are  uniformly  positive  or,  if  negative, 
easily  remediable.  We  believe  that  this  stance  creates  important  gaps  in  the 
preparation  of  the  future  data  science  and  analytics  professionals.  In 
particular,  we  believe  that  they  are  underprepared  for  the  significant  ethical 
challenges  they  are  likely  to  confront  throughout  their  careers. 

Why  an  Implications  Focus  Is  Essential 

Over  the  last  century,  thoughtful  scientists,  engineers,  and  technical 
professionals  have  taken  powerful  stands  on  the  uncertain  or  potentially 
negative  consequences  of  innovations  such  as  fossil  and  atomic  energy, 
genetic  engineering,  and  nanotechnology.  Appropriate  responses  to  such 
issues  are  never  easy  to  find  and  are  always  contested,  but  nothing  is  gained 
by  sweeping  the  issues  under  the  rug.  Failure  to  raise  concerns  and  debate 
the  issues  publicly  generally  only  convinces  the  public  that  there  is 
something  to  hide  and  creates  opposition  that  can  block  beneficial 
innovations.  This  is  as  true  of  big  data  as  it  is  of  nuclear  power.  For  instance, 
the  InBloom  big  data  educational  innovation  was  terminated  after  parents 
voiced  fears  over  possible  secondary  uses  of  their  children’s  data  (Kharif 
2014).  When  professionals  are  primed  to  understand  and  raise  questions 
about  the  possible  downsides  of  a proposed  data  innovation,  better 
technology  uses  and  outcomes  are  possible  for  all. 

Among  the  reasons  for  preparing  data  scientists  to  understand  the 
broader  implications  of  their  work  are  the  following: 

• The  systems  that  feed  and  flow  from  data  analytics  are  often  highly 
complex,  drawing  data  from  numerous  sources  both  external  and 
internal  to  an  organization  and  involving  interconnections  among 
independently  developed  systems.  The  sheer  complexity  of  such 
systems  can  give  rise  to  unexpected  outcomes  and  glitches. 
Education  is  needed  to  anticipate,  prevent,  diagnose,  and  correct 
such  outcomes. 

• Relying  on  intuition  in  the  interpretation  of  analytics  results  can  lead 
to  serious  practical  errors.  Data  scientists  need  to  be  well  attuned  to 
the  sources  of  error  in  data  and  algorithms  and  to  human  reactions 
to  labeling,  subtle  guidance,  and  constraints  on  their  behavior. 

• Individual perspectives and value judgments regarding the
implications  of  analytics-based  systems  and  the  potential  that 
analytics  infrastructures  create  vary  quite  significantly.  Education 
can help analytics professionals learn ways to put personal biases
aside  and  arrive  at  a more  neutral  analysis  and  a more  successful 
resolution  of  complex  sociotechnical  situations. 

• Automated  decision-making  systems  that  operate  without 
continuous  human  involvement  can  amplify  the  negative 
consequences  of  flawed  data  analyses.  “Invisible  technical  workers” 
(Ribes et al. 2013) make choices, both at the time of original design
and implementation of the systems and during system operation
(which might be fully integrated), that potentially have far-reaching
consequences for both individuals and organizations.
These  consequences  are  often  opaque  to  organizational  clients  and 
users.  In  some  cases,  even  technical  specialists  may  not  understand 
why  their  algorithms  produce  the  results  they  do.  Greater  awareness 
of  this  possibility  is  needed  to  produce  and  update  algorithms  that 
work  well,  to  devise  effective  ways  for  humans  to  intervene,  and  to 
give  affected  people  the  opportunity  for  redress  when  errors  are 
made. 

In  short,  realizing  the  potential  benefits  of  big  data  without  the  possible 
harms  requires  data  scientists  and  analysts  who  are  sensitive  to  data 
downsides  as  well  as  data  upsides. 

Knowledge  Areas  and  Pedagogical  Approaches 

We  believe  that  data  science  and  analytics  programs  need  modules 
and  course(s)  on  the  implications  and  potential  consequences  of  analytics. 
These  courses  must  be  specialized  to  the  particular  legal,  ethical  and  societal 
issues  raised  by  big  data.  For  example,  simply  adding  a course  on  business 
ethics  (which  may  cover  foreign  corrupt  practices,  abusive  labor  practices, 
and  environmental  damage)  to  a business  analytics  program  does  not 
rigorously  address  issues  like  data  protection  law,  personal  information 
privacy,  online  harassment,  or  glitches  in  online  trading.  Naturally,  ethical 
theories  and  general  principles  are  shared  across  contexts,  but  the  way  these 
principles  are  applied  requires  in-depth  understanding  of  the  dependencies 
and  connections  discussed  above. 

It is beyond the scope of this paper to present a detailed proposal for
courses  (or  course  modules)  on  the  legal,  ethical,  and  societal  implications 
of  analytics  and  data  science.  However,  we  propose  that  the  following 
knowledge  areas  should  be  covered: 

• Categories  of  major  implications  and  potential  consequences  of 
analytics-based  systems  (as  laid  out,  for  example,  in  the  framework 
proposed  by  Markus  and  Topi  2015) 

• Methods  for  identifying,  analyzing,  and  understanding  complex 
organizational  situations  from  multiple  perspectives  in  an  unbiased 
way,  particularly  from  the  perspective  of  implications  and  potential 
consequences  of  technology-enabled  systems 

• Rich  collections  of  relevant  real-world  examples  that  illustrate  the 
positive  impacts  of  thorough  implications  analysis 

• Material  that  allows  students  to  understand  themselves  as  ethical 
decision-makers.  It  is  particularly  important  that  the  students  have  a 
good  understanding  of  the  sources  of  potential  biases  in  data, 
algorithms,  and  decision-making  processes. 

Pedagogically,  the  modules  or  courses  required  for  developing  these 
competences  need  a good  balance  between  the  elements  that  build  a strong 
conceptual  foundation  and  those  that  apply  methods  of  active,  participatory 
learning.  It  is  essential  to  allow  students  to  internalize  the  theories  and  make 
personal  discoveries  through  case  analysis,  interviews  and  observations  in 
organizational  settings,  role  play,  games,  simulations,  and  other  similar 
pedagogical  approaches.  The  modules  also  need  exercises  that  help  the 
students  discover  their  own  value  positions  on  analytics  issues.  Many  of 
these  materials  do  not  currently  exist,  and  developing  them  should  be  an 
important  priority  for  the  data  science  and  analytics  community. 

Conclusion 

In  this  paper,  we  have  argued  that  every  analytics  and  data  science 
program  should  include  a significant  emphasis  on  the  implications  and 
potential  consequences  of  big  data.  This  can  be  accomplished  through  a set 
of  components  (or  a single  course)  that  provide  students  with  the 
opportunity  to  engage,  theoretically  and  experientially,  with  the  legal, 
ethical  and  societal  implications,  and  potential  consequences  of  big  data. 
Without  a systematic  approach  to  developing  these  competencies,  even 
highly  trained  and  technically  competent  experts  may  approach  their  work 
with perspectives that are too narrowly focused on the potential benefits of
innovation  and  too  neglectful  of  potential  harms.  We  firmly  believe  that 
integrating  attention  to  implications  and  consequences  into  every  analytics 
and  data  science  program  will  lead  to  great  value  for  individuals, 
organizations,  and  society. 

Abstract

“Big  Data”  is  a relatively  new  term,  often  used  imprecisely  and  often  in  contexts 
that  imply  a pressing  need  for  workers  with  a newly-blended  unique  skillset. 
However,  social  scientists  have  worked  with  exceptionally  large  data  sets  for 
quite  some  time,  historically  accessing  remote  space,  writing  code,  analyzing 
data,  and  then  telling  stories  about  human  social  behavior  from  these  complex 
sources.  Therefore,  more  than  a half  century  of  accumulated  social  science 
knowledge  about  extracting  information  from  very  large  data  sets  to  understand 
human  social  behavior  provides  a model  for  the  emergent  data  science 
profession.  In  this  article  I present  analyses  of  current  and  projected  U.S. 
workforce  data  using  various  definitions  of  skillsets  for  data  scientists, 
concluding  with  a discussion  of  the  policy  implications. 

Introduction 

“Big  Data”  is  a relatively  new  term,  often  used  imprecisely.  Recent 
U.S.  science  and  technology  policy  panels,  including  the  President’s 
Council  of  Advisors  on  Science  and  Technology  (PCAST  2015)  and  a U.S. 
National  Academies  of  Science  study  group,  deploy  the  term  “big  data”  in 
a way  that  suggests  that  in  the  past,  the  issues  of  privacy,  worker  skills,  and 
access  were  not  encountered  and  are,  therefore,  new  and  in  need  of 
attention.  However,  social  scientists  have  worked  with  exceptionally  large 
data  sets  for  quite  some  time,  including  implementing  some  of  the  much- 
touted  benefits  of  big  data  such  as  merging  multiple  datasets  from  disparate 
sources  into  larger  files,  hierarchical  file  structures,  and  integrating 
quantitative  and  qualitative  sources  to  derive  insights  into  human  social 
behavior.  Hence,  the  methodological  practices,  ethical  guidelines,  data 
management,  and  statistical  methods  that  have  been  honed  in  the  social 
sciences provide important workforce development lessons for the “new”
big  data. 

Additionally,  depending  on  how  one  specifies  the  workforce  skills 
requirements  for  big  data,  the  size  of  the  existing  pool  of  talent  varies,  as 
does  the  prognosis  for  the  labor  market  fortunes  for  data  scientists  as  the 
21st century’s “sexiest job” (Davenport & Patil 2012, RJMetrics 2015). In
this  article  I start  by  differentiating  the  new  big  data  as  a form  of  what 
Groves  (2011)  describes  as  “organic”  data,  in  contrast  to  the  traditional 
large “designed” datasets that have been and continue to be collected, analyzed,
and reported on by social scientists. Next, I present analyses of current
and projected U.S. workforce data using various definitions of skillsets for
data scientists. Finally, I close by discussing the policy implications for
the big data workforce.

What is Big Data and How Does it Differ from Previous Types of Large Datasets?

The  term  “Big  Data”  typically  refers  to  what  Groves  of  the  U.S. 
Census  Bureau  has  referred  to  as  “organic”  data  (2011),  in  contrast  to 
“designed  data.”  As  these  terms  imply,  designed  data  are  the  traditional  raw 
material  deployed  by  social  scientists  to  answer  research  questions  with 
carefully  designed  studies  using  tested  and  accepted  methodologies  to 
advance  knowledge  of  the  social  world.  Organic  data  are  observational  data 
generated  by  the  day-to-day  behaviors  of  people.  Social  scientists  also 
gather  organic  data,  but,  again,  such  data  collections  are  designed  as 
opposed  to  the  data  mining  approach  associated  with  the  new  big  data. 

Social  scientists  from  many  fields  have  worked  with  very  large 
datasets.  The  computing  power  available  in  personal  computers  now  far 
exceeds  the  capabilities  of  these  machines  just  two  decades  ago.  Those  who 
worked  with  very  large  files  such  as  those  from  the  U.S.  Census,  the  Panel 
Study  of  Income  Dynamics,  or  the  Department  of  Education’s  High  School 
and Beyond longitudinal study program (to name just a few) often accessed
tape- or cartridge-stored files using mainframe or minicomputer systems such as
IBM mainframes or the DEC VAX. Researchers’ individual accounts at colleges and universities
were  typically  insufficient  in  size  to  permit  storage  and  analysis, 
necessitating  that  social  scientists  who  were  “quants”  learn  how  to  access 
and  assign  virtual  space  on  which  to  park  files  and  then  perform  statistical 
analyses  using  one  of  a number  of  statistical  packages  like  BMDP,  SPSS, 
or SAS (among others). Additionally, social scientists needed to learn the
command syntax for these specialized packages; therefore, rudimentary
programming  skills  were  also  a critical  element  of  quantitative  social 
scientists’  training. 

In the 1970s-1990s, quantitative social scientists’ graduate
programs  typically  included  advanced  courses  in  research  design,  including 
statistical  analysis,  but  often  did  not  emphasize  visualizations,  which  were 
not  as  commonly  deployed  at  that  time  in  peer-reviewed  research  literature. 
Technical  skills  such  as  programming  and  command  syntax  were  easily 
learned  via  workshops  offered  by  campus  computer  centers,  basic  classes 
offered  by  computer  science  or  management  information  science  programs, 
or  programming  manuals.  The  academic  program  focused  on  the  substance 
and  design  issues  that  provided  a foundation  for  the  analyses  that  social 
scientists  would  perform  on  the  large  datasets.  Quantitative  social 
scientists — in  accordance  with  the  deductive  scientific  method — started 
with  ideas  and  then  located  the  appropriate  data,  which  could  be  used  to 
answer  research  questions. 

With  the  substantial  increases  in  desktop  (and  laptop)  computing 
power,  many  social  scientists  now  have  the  luxury  of  being  able  to  store  and 
analyze  many  of  these  same  datasets  on  a local  machine  rather  than 
mounting  tapes  and  using  remote  computers.  Additionally,  many  of  the 
popular  statistics  packages  developed  windows-based  products,  which 
enable  many  quantitative  social  scientists  today  to  more  easily  manipulate 
and  analyze  data  without  learning  complicated  command  syntaxes. 

Turning  to  a consideration  of  the  implications  of  big  data  with 
respect  to  observational  data,  three  features  now  make 
organic/observational  data  “big”:  velocity,  variety,  and  volume,  the  three 
Vs (McAfee and Brynjolfsson 2012; De Mauro et al. 2015). Table 1
compares social science designed observational data and organic “big” data
on the three Vs. In a nutshell, while the variety of both designed and organic data
is vast, the large designed datasets used by social scientists are orders
of magnitude smaller than organic big data. Information derived from
designed  data  requires  more  time  to  emerge  than  the  pace  with  which  the 
insights  from  organic  data  are  demanded  in  business  settings  for  data-driven 
decision  making. 

Table 1. Observational Data - Designed and Organic (i.e., Big)

Factor: Velocity

Social Science - Designed Observational Studies: Slow: once a study is designed,
observational data are collected, sometimes using paper-and-pencil,
sometimes using recording devices (audio and/or video as appropriate),
and sometimes using minicomputers of various types. Once these data are
gathered, analysis with qualitative analysis packages for the social sciences
requires content to be manually tagged, sometimes by multiple coders.

Big Data - Organic Data: Fast: with the diffusion of smart phone technologies
and video surveillance, individuals’ behaviors are able to be gathered instantly
within a medium that may be used to analyze and influence subsequent behaviors.

Factor: Variety

Social Science - Designed Observational Studies: In a designed study, scope is
limited by the particular research question, time constraints, and technology
associated with gathering relevant data. Though limited within a particular
study, observational researchers have studied a dizzying array of topics.

Big Data - Organic Data: Large technology companies, including those that
develop applications for smart phones, can access many types of data about
many behaviors of individuals. Smartphone apps access location information
and are able to track individual preferences about entertainment,
transportation choices, eating and exercise habits, and buying patterns
(among others).

Factor: Volume

Social Science - Designed Observational Studies: Generally megabyte-order of
magnitude.

Big Data - Organic Data: “Big” is a misnomer for the petabyte-sized files that
are floating in cyberspace.

Many  of  the  techniques,  tools,  and  protocols  developed  by  social 
science  research  communities  to  manage  and  share  large  designed 
datasets — including  attention  to  the  ethical  issues  associated  with  collecting 
these  data — hold  important  implications  for  the  big  data  workforce.  Since 
1962 the Inter-university Consortium for Political and Social Research
(ICPSR) at the University of Michigan Institute for Social Research has
provided a repository for large social science datasets. ICPSR curates these
data,  provides  support  to  social  scientists  via  training  sessions  in 
quantitative  methods,  and  has  long-established  and  continuously  evolving 
standards  for  the  storage  and  use  of  data  with  attention  to  the  issues  of 
confidentiality and privacy. A 2014 PCAST report on privacy issues and
big  data  recognized  the  potential  value  that  could  be  added  by  the  social 
sciences  in  its  third  recommendation: 

With  coordination  and  encouragement  from  OSTP,  the 
NITRD  agencies  should  strengthen  U.S.  research  in  privacy- 
related  technologies  and  in  the  relevant  areas  of  social 
science  that  inform  the  successful  application  of  those 
technologies.  (PCAST  2014:  xiii) 

The  fourth  recommendation  indicates  a need  to  incorporate 
education  about  privacy  issues  into  the  education  and  training  of 
professionals who work with big data. Social science research methods
classes, typically required courses at both the undergraduate and graduate
levels, cover these issues. Additionally, online training platforms (see
endnote 2) designed to educate and certify social, life, and medical
scientists in issues associated with human subjects in research represent
another mechanism by which those in disciplines that engage with big data
(computer scientists, marketing researchers, and machine language
programmers) could be educated in the complex issues associated with
protection of individuals’ privacy.

Workforce Considerations - What are the Skills Needed for “Data Scientists?”

Estimates  of  the  needs  for  the  big  data  workforce  vary  widely 
because  the  skillset  is  a somewhat  moving  target.  Starting  in  2009, 
Hammerbacher  suggested  that  the  new  big  data  required  a new  occupation, 
the  data  scientist,  who,  at  Facebook,  would  be  able  to  use  a variety  of 
programming  skills — Hadoop,  R,  and  Python — to  access  space  for  the  new 
huge  datasets  and  complete  analyses.  Similar  emphasis  on  programming 
skills  and  alignment  with  computing  and  information  technology  (IT) 
disciplines  was  reflected  in  a 2010  PCAST  report  on  Networking  and 
Information  Technology  (NIT)  Research  and  Development  (NITRD): 

NIT  is  the  dominant  factor  in  America’s  science  and 
technology  employment,  and  that  the  gap  between  the 
demand  for  NIT  talent  and  the  supply  of  that  talent  is  and 
will  remain  large.  Increasing  the  number  of  graduates  in  NIT 
fields at all degree levels must be a national priority.
Fundamental  changes  in  K-12  education  are  needed  to 
address  this  shortage.  (PCAST  2010:  85) 

Other  sources  in  2010  and  2011,  though,  indicate  additional  skills 
beyond  the  technical  computing  skills  cited  by  PCAST.  A 2010  Economist 
article  reported  on  the  new  profession,  suggesting  that  data  scientists 
“combine  the  skills  of  software  programmer,  statistician,  and 
storyteller/artist  to  extract  nuggets  of  gold  hidden  under  mountains  of  data.” 
In 2011, McKinsey reported that the skills needed to exploit big data were
so disparate that three types of workers would be needed:

Our  research  identifies  three  key  types  of  talent  required  to 
capture  value  from  big  data:  deep  analytical  talent — people 
with  technical  skills  in  statistics  and  machine  learning,  for 
example,  who  are  capable  of  analyzing  large  volumes  of  data 
to  derive  business  insights;  data-savvy  managers  and 
analysts  who  have  the  skills  to  be  effective  consumers  of  big 
data  insights — i.e.,  capable  of  posing  the  right  questions  for 
analysis,  interpreting  and  challenging  the  results,  and 
making  appropriate  decisions;  and  supporting  technology 
personnel  who  develop,  implement,  and  maintain  the 
hardware  and  software  tools  such  as  databases  and  analytic 
programs needed to make use of big data. (Manyika et al. 2011: 103)

As  described  above,  these  same  skills  are  akin  to  those  of 
quantitative  social  scientists  who  used  programming  skills  to  manipulate 
data  and  perform  statistical  analyses  to  extract  information  from  very  large 
datasets.  Everything  old  is  new  again:  data  science  is  a new  version  of 
quantitative  social  science,  but  without  the  research  foundation  in  human 
behavior  and  the  ethical  standards  of  the  social  sciences.  It  is  important  to 
recognize,  however,  that  much  of  social  science  work  with  large  datasets  is 
basic  research  (i.e.,  fundamental  knowledge  creation),  while  the  extraction 
of  information  from  big  data  is  applied  research  (i.e.,  enabling  data-driven 
decision  making  in  work  settings). 

Most  recently,  however,  some  observers  of  the  emergent  data 
scientist  profession  are  less  optimistic  about  the  future  of  this  occupation. 
Darrow (2015) concludes, “Enjoy your fat salaries while you can data
scientists,  because  the  rising  tide  of  new  talent  and — gasp — automation  will 
take  their  toll.”  The  same  Fortune  article  quotes  Alex  Cosmas  of  Booz  Allen 
Hamilton  (BAH),  who  indicates  that  BAH  hires  analysts  and  then  trains 
them  in  the  technical  skills  of  data  science:  “We  look  for  raw 
inquisitiveness,  the  intellectual  curiosity  which  will  repay  you  tenfold.”  As 
in  the  past,  the  pool  from  which  data  scientists  will  be  drawn  is  broader  than 
the  pool  of  those  trained  in  computing  or  information  technology. 

What  is  the  Size  of  and  Trends  in  the  U.S.  Data  Science  Workforce? 

There  is  no  question  about  the  proliferation  of  new  occupations 
associated  with  computing  and  the  burgeoning  size  of  the  information 
technology  (IT)  workforce.  The  ubiquity  of  computing  technology  has 
created  the  need  for  a host  of  workers  in  occupations  that  did  not  exist  two 
decades ago. The rapid growth in demand for workers with IT skills, along
with the variety of such skills, has resulted in a number of mechanisms by
which workers obtain these skills and by which employers obtain the
skilled workers they need. A recent report by RJMetrics (2015) used data
from LinkedIn to estimate the number of data scientists (worldwide) to be
11,400, many of whom held advanced degrees.

Figure  1 shows  U.S.  Bureau  of  Labor  Statistics  (BLS)  projections 
for growth in a number of science and engineering occupations for the
2012-2022 period along with the actual growth in these occupations in the
previous  ten-year  period  (2004-2014).  Between  2004  and  2014,  the  number 
of  jobs  in  the  U.S.  economy  grew  by  5.1  percent,  with  substantially  more 
rapid  growth  in  computing  and  mathematical  sciences  occupations, 
including  those  associated  with  software  development,  which  grew  by  37 
percent  and  27  percent,  respectively.  Architecture  and  engineering 
occupations  barely  grew,  with  a substantial  contraction  in  hardware 
engineering.  Projections  of  growth  for  the  2012-2022  decade  match  this 
pattern  across  occupations,  with  the  most  substantial  growth  projected  for 
computing  and  mathematical  sciences,  especially  software,  both  outpacing 
the  10.8  percent  projected  growth  for  the  overall  number  of  jobs. 
Figure 1. Historical and Projected Demand for IT Workers

[Chart titled "U.S. Labor Force Growth - Actual and Projected," comparing actual growth, 2004-2014, with projected growth, 2012-2022.]

Note: Software includes computer programming and software engineering, a subset of
"Comp./Math. Sci."; Hardware includes Computer Hardware Engineers, a subset of "Arch. & Eng."

Source: Analysis of data from the Bureau of Labor Statistics, 2014. “Table 1.2 Employment by
detailed occupation, 2012 and projected 2022 (Numbers in thousands)” and Current Population
Survey AAT-series Table 11 for 2004-2014.


The plot in Figure 2 provides another way to understand the past
decade  of  change  in  these  technical  occupations  in  comparative  perspective. 
For  both  median  weekly  earnings  of  full  time  workers  and  the  overall 
number  of  workers  in  each  occupational  category,  a change  ratio  was 
computed  as  follows: 


$$
\text{Change ratio} = \frac{\left(E_{occ,\,2014} - E_{occ,\,2004}\right) / E_{occ,\,2004}}{\left(E_{total,\,2014} - E_{total,\,2004}\right) / E_{total,\,2004}}
$$


The  change  ratio  was  computed  for  four  categories  of  occupations 
(denoted  as  occ  in  the  subscripts  in  the  equation):  computer  and 
mathematical  sciences,  architects  and  engineers,  software,  and  hardware. 
The  x-axis  shows  the  employment  change  ratio,  while  the  y-axis  plots  the 
change  ratio  of  median  weekly  earnings  for  these  same  four  occupations.  A 
change  ratio  of  unity  would  indicate  that  the  change  in  the  specific 
occupation is on par with that in the rest of the economy, while ratios under
one show slower change and ratios greater than one indicate faster growth.
As shown in Figure 2, only growth in software occupations outpaced
the  U.S.  in  both  numbers  and  median  weekly  earnings,  while  computer  and 
mathematical  sciences  outpaced  the  U.S.  in  earnings  but  not  numbers.  Both 
architects  and  engineers,  including  computer  hardware  engineers,  grew 
slower  than  the  U.S.  economy  over  the  same  decade  in  terms  of  both 
earnings  and  number  of  workers. 
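
The change ratio described above can be sketched in a few lines of Python. This is only an illustration of the formula: the function names and the input figures are hypothetical and are not taken from the BLS tables used in the analysis.

```python
def growth_rate(start: float, end: float) -> float:
    """Proportional change between two years (0.10 means a 10 percent rise)."""
    if start == 0:
        raise ValueError("start value must be nonzero")
    return (end - start) / start


def change_ratio(occ_start: float, occ_end: float,
                 total_start: float, total_end: float) -> float:
    """Growth rate of one occupation divided by the growth rate of the total."""
    total_growth = growth_rate(total_start, total_end)
    if total_growth == 0:
        raise ValueError("total growth is zero, so the ratio is undefined")
    return growth_rate(occ_start, occ_end) / total_growth


# Hypothetical employment counts (in thousands), chosen only for illustration.
ratio = change_ratio(occ_start=1000, occ_end=1300,
                     total_start=100000, total_end=110000)
print(round(ratio, 2))  # a value above 1 means faster growth than the total
```

The same function applies to median weekly earnings: pass the earnings values for the occupation and for all workers in place of the employment counts.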
Figure 2. Earnings and Employment Change Ratios, 2004-2014

C/MS: Computer and mathematical sciences
Software: Software developers, applications, and systems software
Hardware: Computer hardware engineering
Arch & Eng: Architects and engineers

Source: Analysis of data from the Bureau of Labor Statistics, Current Population Survey
AAT-series, Table 11, 2004 and 2014 (employment), and Table 39, 2004 and 2014 (median weekly
earnings for full time wage and salary workers).
Figures 1 and 2 show data only for the IT occupations associated
with big data. It should be noted, though, that academic discipline silos
complicate definition of the big data workforce. On the one hand, as
described above, quantitative social scientists who work with very large
data sets to generate new, basic science knowledge of human social
behavior do not use the term “big data” to describe the work that they do.
On the other hand, computer scientists have embraced the term “big data.”
Online forums emphasize the primacy of programming, which reinforces a
professional boundary on the skills associated with accessing and analyzing
these organic data. The popular pejoration of social science (clearly
demonstrated by U.S. Congress members’ frequent attacks on social science
projects funded by the National Science Foundation, for example)
reinforces this barrier. The emphasis on technical programming skills and
algorithm development has been suggested as a replacement for the theory
development process with respect to social data, with one observer claiming
that “the data deluge makes the scientific method obsolete” (Anderson
2008).


Conclusion 

Big  data  has  abundant  applications  for  business,  health,  and  finance. 
The  ability  to  rapidly  analyze  exceptionally  large  data  sets  from  multiple 
sources  to  provide  information  to  enable  actions  in  real-time  offers  promise 
in  a range  of  areas.  For  example,  big  data  may  enable  more  precise  dosing 
of  medications  and  has  been  used  to  develop  sensor  technology  to  determine 
when  a football  player  needs  to  be  side-lined  because  of  his/her  heightened 
risk  of  concussion.  Consumers  may  experience  more  efficient  service  and 
process  efficiencies  may  yield  lower  prices  for  consumers  and  higher  profits 
for  businesses. 

The emergence and evolution of the data science occupation bears
ongoing scrutiny. In just a few years, employers have seen the value
associated with a cadre of workers who have both technical skills and
the ability to tell a story with data. However, as noted by BAH’s Cosmas,
locating inquisitive analysts and then training them in the technical skills
may be the direction taken with this workforce. In this
case,  the  potential  recruitment  pool  is  far  wider  than  graduates  of  computer 
science  programs  and,  indeed,  computer  science  programs  will  need  to 
provide  students  with  experiences  that  encourage  inquisitiveness  about 
human social behavior and with robust training about privacy and
confidentiality  on  par  with  that  in  social  science  methods. 

There  are  many  potential  benefits  that  may  be  derived  from  cross- 
pollination  between  computing,  on  the  technical  side,  and  social  sciences, 
on  the  substantive  side,  to  deploy  big  data  as  a tool  for  human  advancement 
beyond  capitalist  accumulation.  Both  sets  of  fields,  however,  need  to  be 
wary  of  professional  boundary  heightening,  which  introduces 
inefficiencies.  Time,  energy,  and  effort  are  needed  to  develop  data  science 
as  a truly  transdisciplinary  field  that  can  yield  both  an  advancement  of  basic 
science  knowledge  about  human  social  behavior  as  well  as  applied  science 
information  for  data-driven  decision  making  in  real  world  contexts.  There 
is  a place  in  such  a transdisciplinary  field  for  both  designed  and  organic 
data,  the  latter  of  which  may  be  more  effectively  translated  into  information 
when  there  is  thoughtful  consideration  of  research  questions,  the  literature 
that  informs  those  questions,  and  use  of  previously  developed  analytical 
methodologies.  Better  translation  of  social  science  research  into  actionable 
information  may  help  diminish  the  challenges  of  its  relevancy  that  have 
plagued  public  funding  of  social  science. 

In the 1970s-1990s, inquisitive social science practitioners
demonstrated that the secrets of accessing and analyzing very large datasets
were relatively easy to acquire; current trends in big data analytics
suggest that the same is true of data science now. While the volume and
velocity of basic research in the social sciences are smaller and slower
than in big data, the variety of data sources and the implications for data
quality (i.e., validity and reliability) are similar. So everything old
is  new  again;  more  than  a half  century  of  accumulated  social  science 
knowledge  about  extracting  information  from  very  large  data  sets  to 
understand  human  social  behavior  provides  a model  for  the  emergent  data 
science  profession. 
Endnotes


1 Social  sciences  are  taken  in  a broad  sense  to  include  fields  categorized  as 
such  by  the  National  Science  Foundation  (e.g.,  anthropology,  sociology, 
psychology,  political  science,  and  economics)  as  well  as  fields  that 
deploy  similar  methods  such  as  marketing  and  educational  research. 

2 These platforms include: the Collaborative Institutional Training
Initiative at https://www.citiprogram.org/; the National Institutes of
Health Protecting Human Research Participants at
https://phrp.nihtraining.com/users/login.php; and the FHI360 Research
Ethics Training Curriculum at
http://www.fhi360.org/sites/all/libraries/webpages/fhi-retc2/index.html.
Abstract

The  educational  system  involves  a complex  set  of  actors,  including  learners, 
parents,  teachers,  and  administrators.  However,  we  now  have  more  data  than  ever 
to  analyze  this  system,  which  could  result  in  a quick  understanding  and  evaluation 
of  public  policies  in  this  complex  policy  area.  This  paper  explores  a new  area  of 
data  about  the  educational  experience,  namely  social  media  data.  This  paper 
outlines  an  exploratory  analysis  of  the  Twitter  discussions  regarding  higher 
education  in  the  USA.  Based  on  a collection  of  more  than  1.5  million  tweets  over 
a period  of  4 months,  we  identify  a few  key  issues  in  the  current  higher  education 
discourse  on  social  media.  We  also  identify  the  effect  of  the  expressed  feelings  of 
the  social  media  users  when  it  comes  to  college  applications,  decisions  and 
completion.  We  conclude  that  policies  in  higher  education  can  be  better  tailored 
if  they  are  informed  by  social  media  discussions. 

Introduction 

The increasing amount of data, the decreasing cost of computational
power, and the improving state of analytics have revolutionized fields from
stock trading to social analytics, but higher education has not received as
much attention. The technology that has transformed many for-profit
businesses and governments can be applied at various colleges and
universities.

One  obvious  place  that  analytics  could  be  useful  is  in  the  classroom, 
but  currently  instructors  at  many  universities  are  using  outdated  and 
inefficient methods to grade assignments and compile these scores into
self-generated databases. In fact, Darrell West argues that “many of the typical
pedagogies  provide  little  immediate  feedback  to  students,  require  teachers 
to  spend  hours  grading  routine  assignments,  are  not  very  proactive  about 
showing  students  how  to  improve  comprehension,  and  fail  to  take 
advantage  of  digital  resources  that  can  improve  the  learning  process”  (West 
2012).  Data  mining  and  analytics  provide  the  capabilities  necessary  to 
circumvent  the  traditionally  cumbersome  grading  processes  and  glean 
insights from student data about performance, learning approaches, and
other  metrics.  For  example,  Leah  Macfadyen  and  Shane  Dawson  developed 
an  “early  warning  system”  which  correctly  identified  81%  of  students  who 
failed  an  online  course  by  creating  a regression  model  that  analyzed  such 
variables  as  total  number  of  discussion  messages  posted  and  total  number 
of  assignments  completed  (Macfadyen  and  Dawson  2010). 

Big data analytics within education could also be used to monitor
student progression through various course sequences for specific majors,
to support online courses that change activities by measuring everything
from individual clicks to aggregate performance, and to drive algorithms
that suggest courses a student should take by analyzing her past grades in
similar courses (Bienkowski et al. 2012). While traditional in-person
classrooms may allow
for  the  collection  of  big  data  for  these  applications,  Anthony  Picciano  notes 
“to  move  into  the  more  extensive  and  especially  time-sensitive  learning 
analytics  applications,  it  is  important  that  instructional  transactions  are 
collected  as  they  occur”  (Picciano  2012).  This  rapid  collection  of  data  is 
most likely to be facilitated by course management/learning management
system  architectures  and  online  and  blended  learning  course  structures 
(Worsley  2012). 

There  is  little  work  that  has  looked  at  how  to  use  analytics  methods 
outside  the  classroom  to  improve  the  overall  educational  ecosystem,  as  well 
as  educational  policy.  However,  insights  produced  by  the  previously 
described  learning  analytics  systems  can  also  be  used  to  inform  policy 
decisions. According to van Barneveld, Arnold, and Campbell (2012),
“Like  business,  higher  education  is  adopting  practices  to  ensure 
organizational  success  at  all  levels  by  addressing  questions  about  retention, 
admissions,  fund  raising,  and  operational  efficiency”.  Michael  Horn  and 
Katherine  Mackey  (2011)  suggest  that  education  analytics  can  be  used  to 
shift  the  focus  from  inputs  to  outputs  when  measuring  academic 
institutional  success.  Instead  of  using  seat-time,  faculty-student  ratios,  and 
dollars  spent  as  a measure  of  success,  analytics  software  can  provide 
information  on  more  appropriate  metrics  such  as  student  performance  and 
retention  rates.  The  biggest  obstacles  to  establishing  more  such  systems  are 
building  data  sharing  networks  where  these  myriad  metrics  can  be 
aggregated,  holistically  analyzed,  and  shared  among  different  institutions 
(West  2012).  A recent  paper  proposes  a model  and  algorithm  that  would 
help prospective students make better informed decisions about the best fit
and  best  college  eco-system  based  on  their  unique  personalities  and 
behaviors  (Berea  et  al.  2015). 

Text  mining,  social  media,  or  sentiment  analysis  on  the  college 
decision  process  has  generally  not  been  discussed  in  education  analytics 
literature  and  therefore  presents  an  interesting  opportunity  to  further 
advance  research  in  this  area.  A recent  survey  by  Piper  Jaffray  found  that 
teens  are  abandoning  Facebook  in  favor  of  Instagram;  76%  of  teens  are  on 
Instagram  and  they  are  using  it  to  gain  an  unfiltered  look  at  colleges 
(Stampler  2015). 


Data  Analysis 

We  collected  data  for  this  education  analytics  project  for  a period  of 
4 months,  between  March  4th  and  July  1st,  2015.  For  this  collection  we  used 
TwEater,  an  original  and  proprietary  collection  tool  developed  at  the 
University  of  Maryland  (TwEater  2015).  Originally,  the  collection  was 
based  on  57  keywords  and  hashtags,  such  as:  “igotin”,  “college”,  “campus”, 
“acceptanceletter”,  and  many  more,  and  the  original  data  set  comprised 
more  than  10  million  tweets.  Since  most  of  these  keywords  were  not 
necessarily  related  to  the  idea  of  higher  education  and  college  admissions 
and  applications,  we  selected  a list  of  25  hashtags  pertaining  exclusively  to 
college,  high  school  and  higher  education.  Out  of  these,  only  20  rendered 
more  than  a tweet,  with  a minimum  of  one  tweet  for  the  hashtag 
#choosingacollege  and  a maximum  of  1,153,618  tweets  for  the  hashtag 
#college  followed  by  282,139  tweets  for  the  hashtag  #campus  (see  Table  1). 

On the basis of this collection, we assembled a data set of 1,523,817
tweets, most of which (73%) refer to the general idea of “college”.
Many  of  these  tweets  are  quite  general,  but  some  of  them  focus  on  specific 
issues,  such  as:  making  college  applications  friendlier  for  the  LGBT 
community,  businesses  supporting  campuses,  parent-student  conflicts  in 
college  decision  making,  and  hard  college  choices  between  various  schools. 
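
As a rough illustration of the filtering step described above, the following Python sketch counts how many tweets mention each hashtag in a selected list. It is not the TwEater tool, whose interface is not described here; the function name, the regular expression, and the sample tweets are all hypothetical.

```python
import re
from collections import Counter

# Matches a hashtag and captures the text after the "#" sign.
HASHTAG = re.compile(r"#(\w+)")


def count_by_hashtag(tweets, tracked):
    """Count the tweets that mention each tracked hashtag (case-insensitive)."""
    tracked = {tag.lower().lstrip("#") for tag in tracked}
    counts = Counter()
    for text in tweets:
        # Use a set so a hashtag repeated within one tweet is counted once.
        tags = {match.lower() for match in HASHTAG.findall(text)}
        for tag in tags & tracked:
            counts[tag] += 1
    return counts


# Hypothetical tweets, for illustration only.
sample_tweets = [
    "Just got my letter! #igotin #college",
    "Touring the #campus today",
    "#College apps are due soon",
    "Lunch was great",
]
counts = count_by_hashtag(sample_tweets, ["#college", "#campus", "#collegebound"])
for tag, n in counts.most_common():
    print(tag, n)
```

Counting tweets rather than hashtag occurrences keeps the totals comparable to a per-hashtag tweet count; a tweet carrying two tracked hashtags is counted once under each.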

Text  Mining 

Based  on  this  collection,  we  built  a dictionary  of  about  470,000 
unique  words  that  are  specific  to  the  discourse  about  higher  education  in  the 
USA. This is a dictionary roughly half the size of the English language (the
Oxford English Dictionary alone has over 600,000 words), with the caveat
that some of the words in our dictionary are informal or abbreviations or
pronouns that may not be currently recognized as being part of formal
English.

The  most  frequent  words  in  the  education  discourse  are  “campus” 
and  “college”,  but  if  we  leave  these  obvious  terms  aside,  words  such  as 
“highschool”,  “acceptance”,  “life”,  and  “met”  are  highlighted  as  the  most 
frequent  ones  that  are  not  directly  related  to  colleges.  This  gives  us  an 
indication  that  students  do  talk  about  college  acceptance,  life,  and  college 
related  meetings  on  Twitter. 

We  also  analyzed  each  of  the  20  keywords  separately  and  created  a 
histogram  of  word  frequency  for  each  of  the  20  keywords.  After  “college”, 
“campus”,  “higher  education”  and  “highschool”,  the  largest  corpuses 
(indicated  by  the  number  of  tweets)  belong  to  hashtags  such  as 
#collegeopportunity,  #collegetour  and  #collegebound.  The  second  most 
frequent  word  in  most  corpuses  is  “student”.  Some  interesting  words,  which 
are  sparse  (low  frequency)  but  appear  more  than  once  and  are  associated 
with  the  most  frequent  terms  mentioned  above,  are  terms  such  as:  “success”, 
“community”,  “hard”,  “chip”  and  “app”.  There  is  a very  large  gap  between 
the  most  frequent  words  and  the  second  most  frequent  words  (showing  the 
long  tail  distribution  of  the  words)  (see  Table  1). 
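
A word-frequency table of the kind described above can be sketched as follows. The tokenization rules (lowercasing, dropping URLs and @-mentions, keeping only alphabetic tokens) are assumptions made for this illustration and are not necessarily the rules used in our analysis; the sample tweets are hypothetical.

```python
import re
from collections import Counter

URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")
TOKEN = re.compile(r"[a-z']+")


def word_frequencies(tweets):
    """Return a Counter of word frequencies across a list of tweet texts."""
    counts = Counter()
    for text in tweets:
        # Remove links and mentions before extracting alphabetic tokens.
        cleaned = URL_OR_MENTION.sub(" ", text.lower())
        counts.update(TOKEN.findall(cleaned))
    return counts


# Hypothetical tweets, for illustration only.
sample_tweets = [
    "College life is hard",
    "college acceptance day",
    "My college app is done http://example.com",
]
freq = word_frequencies(sample_tweets)
corpus_size = len(freq)          # number of unique words
top_two = freq.most_common(2)    # highest and second highest frequency
print(corpus_size, top_two)
```

The number of unique words and the two highest frequencies computed this way correspond to the "Corpus size", "Highest frequency", and "Second highest frequency" columns of Table 1.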

Twitter allows only a fixed number of characters per tweet, so we
also checked how many unique words are used per tweet in our data:
#collegedecision, #collegechoice and #backtocollege have the
most “rich” tweets (an average of ~7 words per tweet), while
#collegeopportunity has the fewest unique words per tweet (an
average of 0.2), probably due to a different type of content used in the tweet
(i.e., hyperlink or video) (see Table 1).
Table 1. The summary statistics for higher education Twitter data

| | Number of tweets | Highest frequency | Second highest frequency | Absolute sentiment score | Corpus size (no. unique words) | Words per tweet |
|---|---|---|---|---|---|---|
| Min | 6 | 2 | 2 | -461647 | 17 | 0.2072 |
| 1st Quart | 14.8 | 10.25 | 3.75 | 3.8 | 91.5 | 1.2242 |
| Median | 198 | 155.5 | 50.5 | 37.5 | 515 | 3.1876 |
| Mean | 76190.9 | 20985.15 | 3399.55 | -34069.8 | 37949.7 | 3.2889 |
| 3rd Quart | 2562.2 | 838.5 | 151.25 | 414.8 | 2727.2 | 4.5682 |
| Max | 1153618 | 193802 | 48383 | 1945 | 469748 | 7.3636 |

Sentiment  Analysis 

We  matched  the  words  in  each  of  the  20  dictionaries  with  the 
AFINN  standard  sentiment  dictionary  (Nielsen  2011)  and  calculated  the 
sentiment  scores  of  the  tweets  in  our  data  (see  Figure  1).  The  AFINN 
dictionary uses a scale from -5 to +5 to rate the affect of approximately 2,000
words. We calculated both the absolute and the weighted scores for each
keyword. The absolute scores show that the three largest corpuses
(“campus”, “college” and “highschool”) are also strongly negative, while
all the rest are positive (with the exception of #backtocollege, where the
absolute score is only -1, close to neutral, and #collegedecision, which is 0).
However, raw sentiment scores do not take into account the volume of the
tweets for each keyword. We therefore examine weighted sentiment scores
(weighted by corpus size and by number of tweets), since the distributions
of the corpus sizes and number of tweets are quite skewed. The weighted
sentiment scores
show  that  #highschool  is  the  most  negative  talk  on  Twitter,  while 
#collegematch  and  #collegeopportunity  are  the  most  positive  ones. 
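
The difference between absolute and weighted scores can be illustrated with a minimal Python sketch. The lexicon below is a toy stand-in in the AFINN style, not the published AFINN list, and the weighting shown (dividing the absolute score by the number of tweets and by the number of unique words) is an assumption about one reasonable way to normalize, not a statement of the exact procedure used here.

```python
import re

TOKEN = re.compile(r"[a-z']+")

# Toy lexicon with integer valences between -5 and +5.
# These entries are illustrative and are NOT the published AFINN values.
TOY_LEXICON = {"love": 3, "great": 3, "stress": -2, "hate": -3}


def tokenize(text):
    """Lowercase a tweet and split it into alphabetic tokens."""
    return TOKEN.findall(text.lower())


def absolute_score(tweets, lexicon):
    """Sum the valence of every scored word across all tweets."""
    return sum(lexicon.get(word, 0) for text in tweets for word in tokenize(text))


def weighted_scores(tweets, lexicon):
    """Normalize the absolute score by tweet count and by vocabulary size."""
    score = absolute_score(tweets, lexicon)
    vocabulary = {word for text in tweets for word in tokenize(text)}
    return {
        "per_tweet": score / len(tweets) if tweets else 0.0,
        "per_unique_word": score / len(vocabulary) if vocabulary else 0.0,
    }


# Hypothetical tweets, for illustration only.
sample_tweets = [
    "I love my campus",
    "Finals stress, I hate finals",
    "Great tour today",
]
print(absolute_score(sample_tweets, TOY_LEXICON))
print(weighted_scores(sample_tweets, TOY_LEXICON))
```

Normalizing in this way keeps a very large corpus from dominating the comparison simply because it contains more scored words.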
Figure 1. Sentiment values for each keyword. (Chart title: Absolute and Weighted Sentiment Values for Each Keyword.)

Zipf’s and Power Law Distributions

Zipf’s law is a well-known statistical regularity observed in natural
language (Zipf 1949) that states that the frequency of any word is inversely
proportional to its rank in the frequency table. We tested whether
Zipf’s law holds for each of the 20 corpuses and found that #highschool,
#highereducation, and #collegetalk have distributions similar to the Zipf
distribution (power of ~ -1), while #backtocollege, #rightcollege, and
#collegecompletion show the farthest departures from the Zipf distribution
(with power of ~ -0.3) (see Figure 2).
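
One common way to estimate the power of such a distribution is to fit a straight line to log frequency against log rank; the slope is the exponent. The sketch below uses ordinary least squares and synthetic frequencies. It is offered as an illustration under that assumption, since the fitting procedure is not specified above, and the function name is hypothetical.

```python
import math


def zipf_exponent(frequencies):
    """Least-squares slope of log(frequency) against log(rank)."""
    ranked = sorted((f for f in frequencies if f > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two nonzero frequencies")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(f) for f in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic frequencies: an ideal Zipf curve and a flatter power law.
ideal_zipf = [1000 / rank for rank in range(1, 201)]
flatter = [1000 / rank ** 0.3 for rank in range(1, 201)]
print(round(zipf_exponent(ideal_zipf), 3))
print(round(zipf_exponent(flatter), 3))
```

A slope near -1 indicates a Zipf-like corpus, while a slope nearer -0.3 indicates a much flatter frequency profile.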


Figure 2. Power law and Zipf distributions of words. (Chart title: Power law distributions in college Twitter talk; x-axis: Log Rank.)


One  way  to  interpret  this  result  (backed  by  the  size  of  the  corpuses, 
as  well)  is  that  there  is  more  actual  discussion  involved  in  general  topics 
about  high  school  and  college  education  as  opposed  to  topics  about  college 
completion  or  matching,  where  the  Twitter  activity  is  more  likely  to  inform 
with  links  and  other  types  of  information  as  opposed  to  offering  opinions 
and personal insights and affect. There is no accepted explanation today for
why Zipf’s law is characteristic of human language, but some prior research
suggests that this distribution is characteristic of natural language and of
human memory for language (Cohen et al. 1997; Piantadosi 2014).
Therefore, tweets that contain content other than words are less likely
to exhibit this pattern.

Retweets 

Retweets in any Twitter data are one way to measure the degree of
popularity of certain tweets. In our data the retweet to tweet ratio is quite
high. The two keywords with the highest retweet to tweet ratio,
“rightcollege” and “collegeopportunity”, had retweeting activity for almost
each and every tweet (0.933 and 0.903, respectively), but this is because the
majority of the tweets with “collegeopportunity” were initiated by the
@WhiteHouse and @BarackObama accounts, which are popular and
frequently retweeted.

Disregarding these outliers, the two keywords with the highest
retweet-to-tweet ratio are “highered” and “collegebound” at 0.484 and
0.468, respectively; almost half of the tweets are retweeted. The high
retweet-to-tweet ratio of “collegebound” provides an interesting insight in
the context of this project. It indicates that many high school seniors turn
to Twitter to broadcast their accomplishments to friends, who share the
congratulatory experience. This conclusion is supported by a reading of the
tweets. Many of the keywords with high absolute numbers of tweets also
have moderately high retweet-to-tweet ratios, namely “highschool,”
“campus,” and “college” at 0.443, 0.399, and 0.356, respectively.
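
The article does not define the retweet-to-tweet ratio precisely. As a hedged sketch, assuming it is the share of collected tweets for a keyword that are retweets, it could be computed per keyword as follows. The record format (keyword, is_retweet) and the sample data are hypothetical.

```python
from collections import defaultdict


def retweet_ratios(records):
    """Compute, for each keyword, retweets divided by all collected tweets.

    `records` is an iterable of (keyword, is_retweet) pairs. This layout is
    an assumption made for illustration, not the study's data format.
    """
    totals = defaultdict(int)
    retweets = defaultdict(int)
    for keyword, is_retweet in records:
        totals[keyword] += 1
        if is_retweet:
            retweets[keyword] += 1
    return {keyword: retweets[keyword] / totals[keyword] for keyword in totals}


# Made-up sample: two "highered" tweets (one a retweet) and one "campus" tweet.
sample = [("highered", True), ("highered", False), ("campus", False)]
ratios = retweet_ratios(sample)
```

Under this definition a ratio of 0.484 for “highered” would mean that roughly 48 of every 100 collected tweets with that keyword were retweets.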

Conclusion 

Our analysis is constrained to only about 4 months of collection and
a short list of keywords. Even so, our findings show the following: there
is generally a negative sentiment regarding colleges, campuses, high school
and higher education; there is a tension between students and parents with
respect to college decisions; campuses and colleges are being judged with
respect to their inclusiveness (e.g., LGBT); people are more interested in
offering their opinions on general subjects (e.g., “campus”) than on specific
ones (e.g., “college tours”, “back to school”).

Our  current  research,  although  exploratory,  points  towards  a few 
general  conclusions  when  using  social  media  or  Big  Data  for  education 
research.  First,  the  selection  of  keywords  and  hashtags  is  essential,  as  these 
are  going  to  determine  the  constraints  for  the  data  that  are  going  to  inform 
any  analysis.  Second,  while  there  is  considerable  discussion  on  Twitter  with 
respect  to  higher  education,  most  of  this  discussion  is  negative.  Third,  social 
media  is  a great  resource  of  information  for  education  policy,  as  it  gives  in 
real  time  the  opinions  of  the  parents  and  prospective  students  when  it  comes 
to college applications, college acceptance, or college campuses.

Abstract

In  this  article,  we  describe  the  new  threats  to  information  privacy  that  appear  as 
the  result  of  the  emerging  Big  Data  practices  and  methodologies  in  today’s 
networked  world.  In  particular,  the  collection  and  analysis  of  large-scale  data 
from  social  networking  sites  challenge  the  traditional  conceptualization  of 
privacy.  In  response,  a new  conceptual  framework  is  proposed  to  encompass 
three  key  dimensions  of  privacy  in  the  Big  Data  context:  information 
identifiability,  information  ephemerality,  and  information  linkability. 

Introduction 

The  “privacy  as  a right”  perspective,  first  introduced  by  Warren  and 
Brandeis  (1890),  has  since  influenced  numerous  opinions  and  court  cases 
on  privacy  and  law  enforcement  (searches  and  seizures),  privacy  and  self 
(abortions  and  embryos),  privacy  and  the  press  (private  facts  exposure  and 
celebrity  privacy),  privacy  in  the  workplace  (psychological  testing  and 
lifestyle  monitoring),  etc.  (Alderman  and  Kennedy,  1997).  However,  these 
issues  were  just  a subset  of  privacy  issues  which  Warren  and  Brandeis  were 
concerned  about  when  they  wrote  the  “right  to  privacy.”  Their  main 
concern  was  with  the  advent  of  technological  developments  (instant 
photography  and  audio  recordings  in  the  late  nineteenth  century)  that  were 
increasingly  revealing  personal  information  without  individuals’ 
awareness. 

Such  privacy  concerns  still  exist  and  remain  highly  relevant  after  125 
years.  In  today’s  Big  Data  Era,  many  data  collectors,  data  brokers, 
aggregation  services  and  various  companies  collect  and  use  personal  data 
without  individuals’  awareness,  which  leads  to  a dark  data  ecosystem.  It  has 
been  estimated  that  there  are  4000  separate  companies  involved  in  the  dark 
data  market  and  many  dark  data  brokers  make  the  data  available  to  any 
buyer willing to pay (Levine 2013).

The emerging field of big data analytics is distinguished from
traditional data analytics by its three key characteristics (McAfee et al.
2012): (1) Volume: Big Data analysis works with petabytes of data in a
single dataset; (2) Velocity: real-time or nearly real-time information is
aggregated and analyzed for agile decision-making; and (3) Variety: Big
Data  takes  all  forms  of  information  ranging  from  sensor  readings  and  GPS 
signals  to  messages,  updates,  and  images  posted  on  Social  Networking  Sites 
(SNSs).  Collecting  large  amounts  of  data,  especially  personal  and  social 
data,  brings  both  opportunities  and  challenges.  While  many  practitioners 
believe  that  the  rise  of  Big  Data  has  potential  for  creating  better  tools  and 
services, scholars (e.g., boyd & Crawford 2012) have already warned about
how poor execution may lead to negative social and economic
consequences such as intrusion into personal privacy, suppression of speech,
and misleading predictions, among many others.

Interaction  between  technological  innovations  and  social  ecology 
usually  has  consequences  far  beyond  the  immediate  purposes  of  the 
technical devices and practices (Kranzberg 1986). One of the major threats
that Big Data analytics poses is to privacy, as it seeks to identify at the
expense of individual and collective identity (Richards & King 2013).
Viewing  Big  Data  as  a public  good,  Acquisti  (2014)  discusses  its  critical 
importance  for  public  decision-making,  and  how  it  can  reduce 
inefficiencies  and  increase  welfare  when  used  properly.  However,  Acquisti 
(2014)  also  questions  who  should  bear  the  economic  cost  of  Big  Data 
practices  that  use  personal  information:  data  subjects  (whose  data  are 
aggregated  and  analyzed),  data  holders  (who  collect  and  handle  consumer 
data),  or  both?  To  face  the  increasing  costs  associated  with  data  storage  and 
analysis, data aggregators and data holders typically assume that they have
rights  to  the  data  and  exploit  user  data  for  profit,  overriding  the  interests  of 
individuals  in  their  privacy  and  leaving  them  few  mitigating  measures 
(Wigan  & Clarke  2013).  Big  Data  practices  as  such  pose  significant  threats 
to  individual  privacy. 

This  paper  aims  at  discussing  the  following  challenges  to  information 
privacy  with  the  emergence  of  Big  Data:  (1)  What  are  the  unique  threats  of 
Big  Data  practices  to  information  privacy?  (2)  How  do  these  unique  threats 
challenge  the  conceptualization  of  privacy?  (3)  How  should  we  address 
privacy challenges in today’s networked world?

Redefining Privacy in a Big Data Context

Information Identifiability

In  today’s  networked  world,  privacy  is  often  shaped  and  enabled  by 
various  features  of  technologies.  For  instance,  websites  often  require  users 
to  disclose  certain  types  of  information  in  order  to  obtain  the  services,  and 
provide  certain  mechanisms  and  tools  for  users  to  manage  their  privacy 
preferences. Personal information, such as name, location, personal
interests, and even information about one’s social networks, can be revealed
voluntarily by the users to socialize and establish social connections.
However,  disclosure  of  private  information  can  have  significant 
consequences,  and  thus  trigger  users’  privacy  concerns  and  shape  their 
privacy  management  behaviors  (Fogel  & Nehmad  2009).  For  instance, 
popular  SNSs  such  as  Facebook  and  Twitter  require  various  levels  of 
information  disclosure  and  information  accuracy  by  design.  Facebook 
requires  users  to  provide  real  names  and  work/education  email  addresses  to 
be  added  to  an  affiliation  or  network.  Twitter,  on  the  other  hand,  does  not 
necessarily  require  real  names,  but  it  sets  users’  profiles  as  public  by 
default, potentially exposing a large amount of personal information to a
wide audience and other third parties. Further, many social networking and
mobile  applications  monitor,  record  and  even  publish  users’  location 
information,  which  is  susceptible  to  unauthorized  disclosure. 

In the context of Big Data, we argue that one fundamental dimension
of information privacy is information identifiability, which is defined as the
amount and the accuracy of personally identifiable information being
revealed. Unique to Big Data practices, individuals can be easily
identified or re-identified. For instance, Narayanan & Shmatikov (2009)
have  demonstrated  how  to  efficiently  de-anonymize  a large  number  of 
Twitter  and  Flickr  users  by  simply  using  data  of  username,  location,  and 
“follow” or “contact” relationships. Data mining using vast amounts of
identifiable information generates hypotheses and discovers general patterns
that could actually be stereotypical and misleading, possibly causing both
privacy loss and economic loss for data subjects, and posing privacy threats
that existing privacy laws are far from able to define or protect against
(Brankovic & Estivill-Castro 1999). Even riskier, Big Data analytics can
now gather and extract implicit user data across different social
networking platforms. Beyond user-specified data such as usernames and
locations, SNSs can now automatically generate user information
through mechanisms such as face recognition, geo-tagging, and multi-site
uploading,  further  increasing  the  amount  and  the  accuracy  of  personal 
information  (Smith  et  al.  2012).  These  data  are  usually  extracted  from  the 
uploaded  files  or  generated  through  metadata.  Users  are  often  unaware  that 
these  data  are  stored  and  can  be  used  for  identification. 

Thus,  information  identifiability  is  one  key  dimension  of  privacy,  and 
has  extensive  new  meanings  in  the  Big  Data  context.  Big  Data  tools 
significantly  increase  the  potential  to  identify  individual  users  through 
social  data  and  reveal  more  user  information  in  increasing  quantities  and 
accuracy.  The  lack  of  user  awareness  and  regulatory  mechanisms  to  control 
such  information  revelation  signifies  its  impact  on  information  privacy. 

Information  Ephemerality 

However,  information  identifiability  does  not  fully  capture  the  scope 
of  information  privacy,  especially  in  the  Big  Data  era.  Palen  and  Dourish 
(2003)  argue  that  privacy  is  not  only  about  the  identity  boundaries  defining 
self  versus  others,  but  also  the  temporal  boundaries  between  past,  present 
and  future.  Events  of  information  disclosure  are  not  isolated,  but 
sequentially  connected.  Therefore,  information  disclosed  at  a specific 
instance  becomes  contextualized  and  interpreted  in  relation  to  other  events 
and  situations,  if  the  latter  are  available.  In  our  daily  life,  information  tends 
to  be  ephemeral;  the  information  that  we  share  and  exchange  is  constrained 
to  a certain  physical  location  and  a certain  time  period  before  it  gets 
forgotten.  While  we  constantly  observe  the  action  of  forgetting  in  our  social 
life (Mayer-Schönberger 2009) and in social norms and policy (Blanchette
& Johnson  2002),  recent  advances  in  information  technologies  have  offered 
inexpensive,  large-volume  digital  data  storage  capacity,  making  the 
persistence  of  information  the  odd  commonplace  (Ambrose  2012).  The 
extended  information  lifespan  has  significant  privacy  implications,  as  the 
preservation  of  personal  information  amplifies  and  prolongs  the  effect  of 
any privacy loss. The persistence or the ephemerality of information has not
been a major privacy concern in the past few decades, but more recently a
new consensus on the privacy threat has formed around the fact that
information, once online, is there forever. This realization has brought
attention to a new aspect of privacy: “the right to be forgotten”
(Ambrose 2012). We define information ephemerality as the duration for
which private information remains available, accessible and stable; it is an
increasingly significant dimension of information privacy in close relation
to big social data.

While  the  traditional  forms  of  data  analytic  tools  may  not  be  able  to 
handle  large-scale  longitudinal  data,  Big  Data  technologies,  in  particular, 
can  use  the  persisting  records  of  social  data,  sometimes  beyond  a single 
SNS  platform,  and  change  the  availability  and  accessibility  of  information 
from  the  “here  and  now”  to  the  “everywhere  and  forever”  (Grudin  2002). 
The  accumulated  user  data  on  Facebook  alone  have  been  used  to  reveal  the 
evolution  of  user  interactions  over  three  years  (Viswanath  et  al.  2009),  the 
longitudinal changes in privacy and disclosure behaviors over six years
(Stutzman  et  al.  2012),  as  well  as  the  year-long  variation  of  national 
happiness  levels  (Facebook  2010).  However,  changing  the  ephemeral 
nature  of  information  and  making  longitudinal  analysis  of  such  big  social 
data  can  be  damaging.  When  modeling  large  datasets  over  time,  many 
time-sensitive  factors  may  come  into  play  to  influence  outcomes.  Without 
considering  these  factors  and  changes  over  the  course  of  time,  data  will  be 
taken  out  of  context,  often  lose  meaning  and  value,  and  be  interpreted  in 
misleading ways. For instance, boyd and Crawford (2012) point out that the
types of social networks derived from mining a longitudinal dataset, namely
“articulated networks” (networks resulting from people specifying
contacts through mechanisms such as friend lists or instant messenger lists)
and “behavioral networks” (networks derived from communication patterns
such as email exchanges and Facebook photo-tagging), tend not to be
equivalent to true personal networks. “False discoveries” like this, drawn
from large-scale social data, not only breach personal privacy but may also
have severe real-world consequences affecting the products, bank loans,
and health insurance a person receives.

Information  persistence  is  a unique  “big  social  data”  threat  to  users’ 
information  privacy.  Because  the  real  world  is  one  that  is  ephemeral  rather 
than  permanent,  individuals  apply  the  same  kind  of  expectations  to  their 
online  disclosure,  expecting  the  information  that  they  share  online  will  not 
be everlasting (Shein 2013). Big Data tools not only serve to document and
store longitudinal records of private information, but also use and analyze
them for inferences, knowledge, and trends regarding users’ behavioral
intentions and social implications, significantly challenging users’ privacy
expectations and violating their privacy rules.

Information  Linkability 

User  data  on  SNSs  concern  not  only  information  about  the  users 
themselves,  but  also  information  about  the  users’  colleagues,  friends,  and 
others  they  come  into  contact  with.  As  SNSs  facilitate  connectedness  across 
boundaries  and  in  dynamic  ways,  neither  a one-time  snapshot  nor  an 
over-time  trace  of  a single  user’s  profile  can  fully  capture  the  complexity  of 
SNS  data  (boyd  & Crawford  2012).  Unique  to  the  social  data  generated  and 
accumulated  online,  information  privacy  is  dependent  not  on  one  single 
user,  but  on  a web  of  users  to  whom  this  individual  is  connected  and  on  the 
information that they disclose. Xu (2012) proposes the notion of privacy
2.0 to describe the phenomenon in which information disclosure is
co-constructed by users and their social connections, which demands that the
responsibilities of privacy protection be distributed through their social
networks. Following Xu (2012), we suggest information linkability as
the third key dimension of information privacy in the Big Data context, and
define it as the degree to which information is relational and linked through
social connections.

As privacy scholars (Lampinen et al. 2011; de Wolf et al. 2014) have
recently observed, the connected nature of SNS data and the interpersonal
nature of information sharing have made individualistic privacy protection
strategies inadequate. Even if a user adopts tight privacy settings, his or her
personal information could still be accessed or misused because of friends’
ignorance of privacy and security (Xu, 2012). As a result of such
information  linkability,  SNS  data  are  often  gathered  and  exploited  without 
the  consent  of  the  individuals  to  whom  the  data  relate,  and  individuals  who 
volunteer  such  data  only  have  moral  responsibility  for  their  actions  (Wigan 
& Clarke  2013). 

Privacy  risks  in  relation  to  information  linkability  become  an 
especially  prominent  problem  with  Big  Data  practices.  Many  analytic  tools 
are  specifically  designed  for  social  network  analysis  to  draw  patterns  and 
insights  from  cliques,  groups,  and  even  large  social  networks  (Davenport  et 
al. 2013). To address this emerging privacy issue, Troshynski et al. (2008)
argue that users, researchers, and practitioners of big social data should
consider not only personal privacy implications but also accountability, a
broader concept that encompasses the privacy considerations in a
multi-directional relationship: accountability for others’ personal privacy,
and accountability to superiors, colleagues, and the public. To achieve
accountability, consideration needs to be given to the control of and power
over the access to and use of linked and relational information, and to the
differences between accessibility and publicness (boyd & Crawford 2012).

The  Emerging  Field  of  Human-Data  Interaction  (HDI) 

The technological advancement in machine learning and automated
content analysis continues to increase the power of today’s big data
ecosystem. To address privacy concerns, many privacy scholars suggest
that  individuals’  awareness  of  privacy  should  be  enhanced  by  providing 
information  transparency  about  what  data  is  collected,  how  it  is  used,  and 
whom  it  is  shared  with  (Wang  et  al.  2013;  Xu  et  al.  2012).  In  the  privacy 
literature,  researchers  have  examined  multiple  ways  of  enhancing 
transparency,  such  as  providing  explicit  textual  privacy  statements  (Pollach 
2006),  presenting  privacy  facts  in  the  form  of  nutrition  labels  (Kelley  et  al. 
2009),  using  warning  icons  to  suggest  suspicious  data  use  (Lin  et  al.  2012), 
and  using  justification  messages  to  explain  information  disclosure 
(Knijnenburg and Kobsa 2013). Transparency is also at the heart of existing
and proposed regulatory schemes. For instance, the U.S. Consumer Privacy
Bill  of  Rights  suggests  that  “companies  should  provide  clear  descriptions  of 
[...]  why  they  need  the  data,  how  they  will  use  it”  (White  House  2012). 

While  empowering  individuals  with  privacy  comprehensiveness  is  a 
desirable  approach  to  raise  awareness,  information  transparency  itself 
cannot  guarantee  privacy.  If  implemented  inappropriately,  the  strategies 
can even backfire. For instance, practices to enhance information
transparency have been criticized for i) increasing users’ cognitive load by
having users process long and ambiguous statements, and ii) leading to a
“context  collapse”  where  users  lack  contextual  explanations  and 
justifications  to  aid  their  real-time  privacy  decision  making  (Vitak  2012). 
Therefore,  in  raising  users’  privacy  awareness,  it  becomes  imperative  to 
find  an  effective  way  to  present  and  implement  transparency. 

In this article, we argue that privacy researchers who are interested in
addressing Big Data privacy challenges are likely to benefit from the
emerging field of Human-Data Interaction (HDI) (Mortier et al. 2014). HDI
emphasizes creating a collaborative but sometimes combative data
ecosystem around multiple stakeholders engaging in the collection and use
of personal data. The HDI approach does not throw out transparency
entirely, but gently refocuses this paradigm onto individuals’ privacy
awareness that would enable legibility, agency and negotiability.

Legibility is concerned with making the data space (from collection, use,
and analysis to retention) both transparent and comprehensible (Mortier et al.
2014). To achieve the goal of legibility, researchers need to create
innovative mechanisms to visualize: who has collected what private data;
how the private data are being processed; how individuals’ private data are
mingled with others’ private data; what is done by the data brokers; and who
is using the private data and how. We argue that legibility empowerment is a
precursor  to  an  individual’s  ability  to  exercise  agency  in  situations  where 
personal  data  are  being  collected  and  used. 


Agency is concerned with giving people the capacity to act within the
data ecosystem. Consistent with Mortier and associates (2014), we do not
believe that all individuals should continually exercise this capacity, but
those who wish to should be able to exercise agency whenever they choose.
Thus privacy researchers and technologists need to operationalize agency
through an intelligent, personalized approach, providing individuals with the
customized option of expressing concern over certain data uses with which
they do not agree.


Negotiability  is  concerned  with  many  dynamic  relationships  that 
arise around data and data processing (Mortier et al. 2014). This theme
requires engagement with stakeholders to collaboratively decide what data
exchanges occur and why, as well as to specify the
information flow “from whom,” “to whom,” “for what reasons,” and “under
what conditions.” Placing discussions on negotiability empowerment
within  the  relevant  contexts  does  not  suggest  that  there  is  an  agreement 
about  the  level  of  privacy  that  is  appropriate  in  any  given  context.  However, 
knowing  the  relevant  dimensions  and  stakeholders  of  information  flow  in 
the specific context does clarify the discussion.

Conclusion

This  article  suggests  a new  approach  to  conceptualizing  privacy  with 
emphasis  on  emerging  privacy  threats  in  terms  of  information 
identifiability, information ephemerality, and information linkability. The
latter two, in particular, are of growing importance, and pose significant and
unique threats to information privacy with the emergence and widespread
adoption of Big Data technologies.

The  reconceptualization  of  information  privacy  in  these  three 
dimensions  provides  a unique  opportunity  for  the  emerging  field  of 
Human-Data  Interaction  (HDI).  It  delineates  three  mechanisms  through 
which  big  social  data  analysis  may  influence  users,  and  serves  as  a 
theoretical  foundation  for  future  user-centered  studies  of  privacy  concerns 
and  privacy  decision-making  concerning  Big  Data  practices  and  products. 
This conceptual framework can further guide privacy research and ethics
discussions to draw economic, social and legislative implications of Big
Data practices, as well as to find practical solutions to these three privacy
challenges.

Editorial Comments

This  Winter  issue  of  the  Journal  has  some  interesting,  eclectic  things  to 
share.  The  Academy  has  signed  a contract  with  JSTOR  - a company  that 
digitizes  science  journals  for  online  access.  We  start  with  a description  of 
that  process.  Eventually  they  will  have  digitized  all  our  back  issues. 

Following  that  is  a tribute  to  WAS  Fellow  Jay  Uhlaner  reprinted 
with  permission  from  the  Human  Factors  & Ergonomics  Society  Bulletin 
and written by G. Krueger.

Next  is  a graduate  student  paper  that  describes  geological  studies 
done  in  the  Central  African  Republic  around  the  region  of  Boali.  Noted  for 
its  waterfalls  Les  Chutes  de  la  Mbi  is  a 656-foot  cascade  where  the  Upper 
M’poko  River  meets  the  Oubangui  River.  The  natural  beauty  of  the  site  has 
earned it a place on the tentative UNESCO World Heritage Site list. The
Falls  of  Boali  are  250  m wide  and  50  m high,  and  are  a popular  tourist 
destination.  I do  not  usually  put  photos  in  the  editorial  comments,  but  this 
is an exception. The Falls of Boali are shown below.

This is the first paper we have received from there. The author is a student
at the China University of Geosciences in Wuhan, China. He speaks French
and  Chinese  but  not  English,  so  it  was  a challenge  to  get  the  paper  first  into 
English  and  second  into  the  structure  of  a technical  and  publishable  paper. 
It  took  several  months  of  work,  and  I am  proud  to  present  the  work  for  his 
Master’s  thesis. 

Then  comes  a paper  on  the  “wow”  signal.  This  one  has  a back  story. 
In  1977  a radio  telescope  in  Ohio  received  an  intense,  short  signal.  At  that 
time  data  were  recorded  on  a chart  recorder  - a bit  like  a lie  detector  setup. 
A sheet  of  paper  rolls  out  with  a moving  pen  recording  the  data.  The  signal 
the  telescope  received  was  so  large  that  the  telescope  operator  wrote  “wow” 
on  the  paper  - hence  the  name  the  “wow”  signal.  Talk  to  any  radio 
astronomer,  mention  the  “wow”  signal,  and  they  will  know  the  story.  To 
date  it  has  not  been  explained.  This  paper  offers  a possibility  for  the  signal. 

Last up is a paper that discusses the metric-based general theory of
relativity and dark energy. The author provides a view of the problem of
dark energy based on Schrödinger’s affine field theory. Put on your serious
math hats for this one. It may explain that odd “cosmological constant”, Λ.

As  usual  the  Winter  issue  ends  with  a list  of  members  of  the 
Academy.

Sethanne Howard

The Washington Academy of Sciences has signed an agreement
with  the  JSTOR  archive  dedicated  to  preserving  scholarly  literature.  The 
complete  back  run  of  the  Journal  of  the  Washington  Academy  of  Sciences 
(JWAS),  which  dates  to  1899,  will  be  digitized  and  made  available  via  the 
JSTOR  online  platform. 

In  addition  to  the  Washington  Academy,  more  than  1,050 
publishers,  including  scholarly  societies  and  publishing  academies  of 
sciences, are currently part of the JSTOR archive, which hosts some 2,200
digitized journals comprising 9 million digitized articles in various
collections. For example, the oldest journal in the JSTOR collections is the
Proceedings and Transactions of the Royal Society of London, which
dates back to 1665.

More  than  8,000  institutions  from  175  countries  make  use  of 
JSTOR, including universities, secondary schools, government and
non-profit organizations, community colleges, museums, and public libraries.
The  Academy’s  former  president  Terrell  Erickson  says,  “This  is  a terrific 
opportunity  for  the  Washington  Academy  of  Sciences  as  it  expands  our 
Journal’s  reach  beyond  our  current  subscriber  base  to  a much  larger 
audience.” 

Several programs help to make sure that the archive’s contents are
widely available at a reasonable cost to users such as students and other
science  professionals.  For  instance,  in  addition  to  the  8,000+  subscribing 
libraries,  JSTOR  is  also  available  to  individual  unaffiliated  researchers 
who  can  access  single  articles  through  various  JSTOR  accessibility 
programs  like  “Register  & Read”  and  “JPASS.”  JSTOR  further  supports 
the  African  Access  Initiative  (AAI)  and  Developing  Nations  Access 
Initiative (DNAI), and these initiatives waive or reduce fees for 1,279
not-for-profit and academic institutions in developing countries.

The  not-for-profit  JSTOR  archive  was  conceived  to  help  libraries 
and  publishers  respond  to  the  rising  costs  associated  with  the  storage  of 
printed  journal  literature  and  to  ensure  that  this  material  would  not  be 
“lost”  as  academic  research  became  increasingly  electronic.  Through  the 
digitization of complete journal runs, JSTOR makes it possible for
subscribing libraries to share the costs associated with storage and
maintenance  of  journal  literature,  as  the  non-destructive  digitization 
process  will  be  done  at  no  cost  to  the  Academy.  Furthermore,  JSTOR  is 
offering  the  Academy  a modest  revenue-sharing  arrangement  based  upon 
access  to  JSTOR  by  its  users. 

JSTOR  will  work  from  print  copies  of  JWAS  to  create  image  files 
that  are  exact  replicas  of  the  original  Journal  pages  and  text  files  that 
enable  searching.  Upon  completion  of  this  process,  users  will  be  able  to 
conduct full-text searches back to the first volume and issue in 1899 when
JWAS  was  called  the  Washington  Academy  of  Sciences  Proceedings. 
Scholars  will  then  be  able  to  browse,  search,  view,  and  print  JWAS  and  the 
earlier  Proceedings  directly  from  their  desktops. 

The  Academy  will  retain  the  copyright  to  the  material  published  in 
its Journal, as the JSTOR license agreement is non-exclusive. The
Academy  is  planning  to  convert  the  current  JWAS  hard-copy  format  to  an 
online  version,  and  is  exploring  hosting  JWAS  electronically  at  its  own 
website  for  its  members  and  numerous  paid  individual  and  institutional 
subscribers, who will be the only ones to have access to the current and recent
issues of JWAS. There will be a 3-year gap between the most recently
published issue of the Journal and the last issue available in JSTOR. This
window of time is being designated for the purpose of separating these
paid  subscribers  and  members  from  the  older  issues  which  will  be 
available  via  the  archive. 

Questions  about  the  Academy’s  Journal  can  be  directed  to  JWAS 
editor Sethanne Howard, sethanneh@msn.com.

MEMBER MILESTONES
Julius “Jay” Earl Uhlaner (1917-2015)

By  Gerald  P.  Krueger 

Human  Factors  and  Ergonomics  Society  (HFES)  Fellow 
Julius  “Jay”  Earl  Uhlaner  was  born  in  Vienna,  Austria, 
in  1917.  In  1928,  he  immigrated  to  the  United  States, 
where  he  became  a naturalized  citizen  and  left  a lasting 
legacy  through  his  leadership  and  research 
achievements,  especially  in  applying  psychology  to 
military  problems. 

Julius Earl Uhlaner

Jay graduated from City College of New York in 1938
with a BS in science. He worked in human engineering
at  Ford  Motor  Company  in  Michigan  from  1939  to  1940  and  established  a 
driver  research  lab.  In  his  early  human  factors  work,  he  focused  on  driver 
vision,  training,  and  safety  issues.  These  interests  led  to  his  thesis  work  for 
his MS in psychology and statistics from Iowa State University in 1941. His
contributions  to  highway  safety  included  significant  research  on  the 
visibility  and  interpretability  of  roadway  signs  with  different  types  of 
lettering  (e.g.,  height/width  ratios  of  letters).  He  served  on  the  Highway 
Safety  Research  Board  in  Lansing,  Michigan,  and  dealt  with  human  factors 
issues. 

While serving as a psychologist in the Army Air Corps during
World  War  II  from  1943  to  1946,  Jay  was  involved  with  developing  criteria 
for  selecting  pilots.  From  1946  to  1947,  he  was  assistant  director  for 
research  and  training  for  the  New  York  State  Division  of  Veteran  Affairs. 
Combining  his  bent  for  human  factors  and  personnel  selection,  he  earned  a 
PhD  in  industrial  and  organizational  psychology  at  New  York  University  in 
1947. 

Jay  then  joined  the  Army  Personnel  Research  Branch  as  a research 
psychologist.  As  the  organization  grew,  it  eventually  became  the  Behavior 
Systems  Research  Lab  (BSRL).  In  1969,  Jay  became  BSRL  technical 
director. Two years later, he also took on the title of chief psychologist of
the  U.S.  Army,  which  is  still  worn  today  by  the  director  of  BSRL’s  even 
broader-based  successor  organization:  The  Army  Research  Institute  (ARI) 
for  the  Behavioral  and  Social  Sciences.  Under  Uhlaner’s  visionary 
guidance,  ARI  gradually  took  on  missions  to  develop  and  improve  the 
performance  of  people  in  the  Army  through  behavioral  sciences  research  on 
personnel  selection,  classification,  job  placement,  training  systems,  and 
human  factors  in  systems  design.  With  Uhlaner  at  its  helm  from  1969  to 
1978,  ARI  grew  to  employ  more  than  400  research  psychologists,  many  of 
them  well  steeped  in  and  practicing  classical  human  factors  methods  and 
attaining  many  noteworthy  accomplishments. 


Jay  was  best  known  for  some  of  his  innovative  contributions  to  the 
Army.  He  foresaw  early  on  the  movement  toward  reliance  on  computers 
and  automation  and  had  ARI  focus  on  “person-in-the-loop”  approaches  to 
examining  soldier-system  interface  situations  wherein  the  infusion  of  new 
technologies  could  enhance  soldier  performance,  training  systems,  and 
equipment  system  testing.  He  spearheaded  development  of  the  first 
psychological  military  qualifications  test  legislated  by  Congress;  introduced 
computers  as  major  tools  and  partners  in  behavioral  science  research; 
pioneered  research  on  night-vision  testing  and  driver  performance; 
introduced  the  first  classification  system  based  on  psychological  aptitude 
testing  in  the  military  services;  pioneered  the  “system  measurement  bed,”  a 
methodology  that  influenced  industrial  psychology;  and  fostered  an 
interdisciplinary approach to ARI’s research.


During his career, Jay published close to 200 articles in scientific journals
and books on the subjects of industrial psychology, military psychology, and
related topics. In 1976, President Gerald R. Ford awarded him the U.S.
Presidential Award for Management Improvement for his commanding role in the
development and implementation of the Army Classification Battery and Aptitude


Jay Uhlaner as director of the U.S. Army
Behavior & System Research Laboratory

advances in the field of soldier performance prediction. In 1995, the
American  Psychological  Association’s  (APA)  Division  19  (Military 
Psychology)  recognized  Jay  with  the  Lifetime  Achievement  Award  in 
Military  Psychology  for  his  many  accomplishments  in  the  application  of 
behavioral  science  research  to  military  problems.  In  2011,  Division  19 
initiated  an  award  in  his  name:  the  Julius  E.  Uhlaner  Award  for 
Distinguished  Contributions  to  Research  on  Military  Selection  and 
Recruitment. 

In addition, Jay was a Fellow of HFES, APA, and the Washington
Academy  of  Sciences  (WAS).  In  1976,  WAS  granted  him  the  first  award 
“for  scientific  work  of  high  merit  in  behavioral  sciences”  (see  below). 

After  retiring  from  the  Army  in  1978,  Jay  was  senior  vice  president 
at  Perceptronics,  Inc.,  a human  performance  modeling,  simulation,  and 
training  consulting  firm  in  California  (at  that  time).  One  of  the  more  notable 
programs  he  fostered  as  part  of  a consortium  for  Defense  Advanced 
Research  Projects  Agency  (DARPA)  was  SIMNET,  which  offered  a tank 
battle 3-D virtual simulation training network that permitted dozens, if not
hundreds, of operators in tanks, helicopters, close support aircraft, and other
battlefield  entities  to  interact  with  one  another  during  war  game  training.  At 
Perceptronics,  Jay  also  did  extensive  work  in  mining  safety  for  the 
Department  of  Commerce.  He  retired  in  2000  but  continued  as  a member  of 
the  board  of  directors.  Subsequently,  he  carried  out  his  own  part-time 
behavioral  sciences  consulting  work  for  another  decade. 

Having  watched  him  from  a short  distance,  I can  say  that  Jay 
Uhlaner  continually  demonstrated  significant  political  and  scientific  savvy 
in  dealing  with  bureaucracy  and  in  getting  things  done.  He  was  particularly 
adept  at  obtaining  buy-in  to  build  up  human  factors  research  psychology  in 
the  military  by  having  his  staff  seek  to  provide  what  the  country’s  leaders 
and  soldiers  needed  most. 

Jay’s  family  can  be  contacted  through  his  beloved  wife  of  66  years, 
Vera  Uhlaner,  at  P.O.  Box  967,  Corona  del  Mar,  CA  92625-9998. 
Award by the Washington Academy of Sciences

The  presentation  was  made  at  the  Annual  Awards  Dinner  meeting  of  the 
Academy  on  Thursday,  March  18,  1976,  at  the  Cosmos  Club. 

Dr. Julius E. Uhlaner, Chief Psychologist of the U.S. Army and Technical
Director  of  the  Army  Research  Institute  for  the  Behavioral  and  Social 
Sciences,  and  Adjunct  Professor  of  Psychology  at  George  Washington 
University,  was  cited  for  “his  outstanding  technical  direction  and  leadership 
in  Applied  Psychology.”  As  a psychologist,  he  is  best  known  for 
contributions  to  military  psychology,  having  spent  the  major  part  of  his 
career  as  a civilian  research  psychologist  in  the  Army.  However,  he  also 
kept  closely  in  touch  with  academia  and  industry.  He  is  best  known  for  some 
of  his  innovative  contributions  to  the  Army,  having  developed  the  first 
psychological  military  qualifications  test  legislated  by  Congress;  introduced 
the  use  of  the  computer  as  a major  tool  and  partner  in  Behavioral  Science 
research;  pioneered  night  vision  testing  research  and  driver  research; 
introduced  the  first  differential  classification  system  based  on  psychological 
aptitude  testing  anywhere  in  the  military  services;  pioneered  the  “system 
measurement  bed,”  a methodology  which  influenced  the  field  of  industrial 
psychology;  and  fostered  the  interdisciplinary  approach  to  much  of  his 
research.  Also,  he  has  exhibited  very  active  professionalism,  including  the 
holding  of  elective  offices  in  divisions  of  the  American  Psychological 
Association. 

His  awards  in  the  Federal  service  include  the  Citation  for  Meritorious 
Civilian  Service,  1960;  Citation  for  Exceptional  Civilian  Service,  1969;  and 
Citation  for  Outstanding  Performance,  1972. 

His  combination  of  experience  and  education  led  to  his  trademark  for  the 
conduct  of  research  in  the  Behavioral  Sciences  — an  interdisciplinary 
approach,  systems  oriented,  and  the  use  of  research  products. 

He  was  elected  a Fellow  of  the  Washington  Academy  of  Sciences  in  1963; 
he  was  also  a Fellow  of  the  American  Psychological  Association.  He  was  a 
Fellow  of  the  Human  Factors  Society  and  the  Iowa  Academy  of  Sciences. 
Other  societies  of  which  he  is  a member  are  the  Operations  Research 
Society  of  America,  International  Association  of  Applied  Psychology, 
Psychonomics  Society,  and  District  of  Columbia  Psychological 
Association. 
Narcisse Bassanganam, Yang Mei Zhen, Prince E. Yedidya
Danguene,  Minfang  Wang 

Earth  Resource,  China  University  of  Geosciences,  Wuhan,  China 
Earth  Faculty,  University  of  Bangui,  Central  African  Republic 

Abstract 

Located  in  the  Central  African  Republic,  the  region  of  Boali  is  noted  for  its 
waterfalls  and  for  the  nearby  hydroelectric  projects.  The  waterfalls  of  Boali 
are  250  m wide  and  50  m high,  and  are  a popular  tourist  destination.  The 
Central  African  Republic  (CAR)  has  large  reserves  of  Granitoids  that  remain 
largely  untapped.  That  is  why  these  rocks,  which  outcrop  and  which  constitute 
the  base  of  the  Boali  region  and  its  surroundings,  caught  our  attention. 
Previous studies by Bowen (1915) explained the order of appearance of
various minerals as a function of the temperature and the initial silica (SiO2)
content of the magma. According to Bowen’s diagram, we can say that the magma
underwent  a magmatic  differentiation  giving  rocks  that  are  poor  in  silica 
(Diorite),  followed  by  rocks  rich  in  silica  (Granodiorite  and  Granite).  Knowing 
the  absolute  age  of  the  Granitoids  on  the  edge  of  the  craton  of  Mbomou  (2.1 
Ga, Moloto et al., 2008, and Toteu et al., 1994), we can deduce the chronology
of  other  formations.  Initially  there  was  the  formation  of  the  metamorphic 
formations  and  sandstones  of  Boali.  This  was  followed  by  a slow  intrusion  of 
magma  which  crystallized  in  depth  to  give  grainy  rock  (granitoids  and 
pegmatite) in the region of Boali. This intrusion metamorphosed the
pre-existing formations through an orthogneiss.

Introduction 

Boali  is  a town  located  in  the  Ombella  M’poko  prefecture  of  the  Central 
African  Republic  (CAR)  (See  Figure  1).  It  is  located  100  km  northwest  of 
Bangui, the capital of CAR. Boali lies at longitude 18°7'0"E and
latitude 4°48'0"N. Access is by National Road 1 (RN1).
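
As a small illustration of the location data above, the following Python sketch converts the degree-minute-second coordinates quoted for Boali into decimal degrees, the form most mapping tools expect. The function name `dms_to_decimal` and the variable names are hypothetical and are not part of the original study.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    South and West hemispheres are returned as negative values.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


# Coordinates of Boali as quoted in the text (illustrative variable names).
boali_longitude = dms_to_decimal(18, 7, 0, "E")
boali_latitude = dms_to_decimal(4, 48, 0, "N")

print(f"Boali: {boali_latitude:.4f} N, {boali_longitude:.4f} E")
```

The conversion simply adds the minutes divided by 60 and the seconds divided by 3600 to the whole degrees, so 4°48'0"N corresponds to 4.8 degrees north.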

Boali is a sub-prefecture in the CAR. The CAR is divided into 16
administrative prefectures, two of which are economic prefectures, and one
autonomous commune; the prefectures are further divided into 71
sub-prefectures. The prefectures are Bamingui-Bangoran, Basse-Kotto,
Haute-Kotto, Haut-Mbomou, Kemo, Lobaye, Mambere-Kadei, Mbomou,
Nana-Mambere, Ombella-M'Poko, Ouaka, Ouham, Ouham-Pende, and Vakaga.
The economic prefectures are Nana-Grebizi and Sangha-Mbaere, while the
commune is the capital city of Bangui.


Fig.  1:  Map  of  CAR  showing  Ombella  M’poko  and  Boali  study  area 


The  Central  African  Republic  is  a country  rich  in  mineral  resources 
with  an  important  reserve  of  Granitoids.  Granitoid  or  granitic  rock  is  a 
variety of coarse-grained plutonic rock similar to granite which is composed
predominantly of feldspar and quartz. These rocks crop out and
constitute the base of the Boali region, but unfortunately are not exploited.

Geologically  Boali  is  very  interesting  because  of  its  Granitoids.  We 
will  identify  and  define  the  importance  and  usefulness  of  the  Granitoids  not 
only  to  geology,  but  also  for  the  economy  and  social  development  in  the 
CAR.  We  note  that  a school  was  built  at  Crossing-Boali  in  1953  by  the  priest 
Alosiste  Gezst,  and  recently,  in  2001,  the  College  of  General  Education 
(C.G.E)  was  built. 
Early geologic studies by Bowen (1915) defined the order of
appearance  of  various  minerals.  Cornacchia  et  al.  (1989)  described  the 
geologic  formations  of  Boali.  They  mentioned  that  the  greenstone  belt  of 
Bogoin-Boali  rocks  represents  a succession  of  structures  with  a narrow 
synclinal  appearance  drawing  a large  half  circle.  These  structures  end  in  the 
east  in  the  Bogoin  area  and  to  the  north  in  the  Boali  sector  as  the 
outcroppings  observed  north  of  Boali. 

Poidevin  (1979)  defined  the  geochemistry  of  Precambrian  basaltic 
rocks from the CAR; at Mbi, not far from the M’poko River, there are three
petrographic types: sericitic schist, chlorite schist, and quartzite.
Cornacchia and Dars (1983) showed that a corridor of faults cuts across the
north of the CAR. Cornacchia et al. (1985) found quartz veins containing rock
crystal in the sandstone. Poidevin and Pin (1986) showed that the
outcropping is multi-kilometric, with an intrusion of dolerite and granites.

Lithological  studies  of  the  Boali-Bogoin-Mbi  region  by  Cornacchia 
and  Giorgi  (1986)  defined  a vast  area  ranging  from  the  border  of  the 
Republic of the Congo to south of the Lobaye Subit-Possel road including
the Boda area. Their work was carried out south of the M’poko River and
continued  from  the  town  of  Bogoin  to  Yangana  up  to  the  Yasi  series  in  the 
area  of  Bangui. 

Biandja (1988) carried out his work largely in the northern region of
the  Bogoin.  Biandja  (2000)  pointed  out  that  the  southern  part  of  the  Boali 
region  is  characterized  by  a series  of  “Mbi”  (waterfalls)  incorporated  from 
the  bottom  upwards.  The  series  contains  amphibolites  of  Mbali  and  Mbi  and 
pillow  basalts.  All  the  intruded  granite  is  in  the  lower  course  of  the  river 
Mandjo.  North  of  the  Bako  village  on  the  Mbi,  this  succession  of  granite 
becomes  abnormal  when  it  contacts  the  red  sandstone  and  the  red  shale  of 
the  base  of  the  sandstone  shale  set.  In  the  northern  region  of  the  Bogoin 
there  is  a succession  of  chloritized  migmatite  and  amphibolites  that  include 
some  biotite  in  the  faults  area.  There  is  also  migmatized  ferruginous 
quartzite.  The  sub-horizontal  schistose  sandstone  does  not  conform  to  the 
crystallophyllian formations. However, the whole region of Boali does
show  some  similarities  between  the  north  and  south. 

According to studies done by Poidevin (1991), Biandja (1988), and
Cornacchia  et  al.  (1985-1989),  the  Boali  region  forms  the  southern  part  of 
a greenstone  belt  that  represents  the  northwestern  part  of  Bogoin-Boali.  The 
orientation of this greenstone belt runs east-west ending at the eastern edge
and  marries  with  an  intrusive  granite  border  at  the  western  edge.  According 
to  the  report  from  the  meteorological  station  of  Mbali  covering  1993  to 
2000,  the  average  annual  rainfall  generally  ranges  from  1900  mm  in 
February  to  2630  mm  in  December,  with  an  average  maximum  of  2868  mm. 
The  Granitoids  are  on  the  edge  of  the  craton  of  Mbomou  (age  2.1  Ga) 
(Moloto et al., 2008; Toteu et al., 1994). A craton is a large, stable block of
the Earth’s crust forming the nucleus of a continent. Recent studies by Rolin
(1992) focused on the Pan-African strike-slip zone of the Oubanguides in the
Central African Republic. In general, Djebebe-Ndjiguim (2013) found that
the  density  of  the  vegetation  made  it  very  difficult  to  search  for  significant 
outcrops. 

We  continue  their  work  to  include  not  only  new  information  on  the 
geologic formation of the Boali region, but also to note the effect that
non-exploitation of the granitoids in the area has on the region. It is a complex
issue.  Consequently  the  granitoids  have  not  contributed  to  the  social 
development  in  the  Boali  area  in  particular  and  to  the  CAR  in  general. 

Techniques  Used  to  Gather  the  Data 

Boali  is  located  100  km  northwest  of  Bangui,  the  capital  of  CAR. 
This field study was done on 24-25 June 2015. We used the basic tools of
the geologist: a compass, camera, hammer, bag, notebook, and pencil. Our
general approach is based on the work of Cornacchia and Giorgi (1986). As
noted  by  Djebebe-Ndjiguim  (2013)  the  amount  and  density  of  the 
vegetation  made  it  very  difficult  to  search  for  significant  outcrops.  The 
authors  followed  two  protocols  set  by  previous  researchers. 

The  first  protocol  we  followed  was  that  of  Biandja  (1988).  His  work 
was  carried  out  largely  in  the  Bogoin  northern  region.  In  his  lithological 
description  he  was  able  to  list  petrographic  features  consisting  of  lateritic, 
ferruginous,  and  conglomeratic  blocks  for  recent  formations.  They 
contained,  on  average,  quartzite,  white  quartzite,  sandstone  quartzite  for 
covering the Proterozoic formation; meta-volcano sedimentary, ferruginous
quartzite,  gneiss,  Amphibolites,  meta-volcanic  basic  to  ultra-basic  schist, 
and  Metarhyolitoids  (meta-volcanic  acid)  for  the  base  formations  of 
metamorphic  rocks.  For  the  intrusions  Biandja  distinguished  many 
characteristics of crystalline granitoid intruding porphyroid granites from
the  base. 

The  second  protocol  we  followed  was  that  of  Poidevin  (1991).  His 
work was also carried out north of Bogoin. He identified different
petrographic characteristics and classified them by stratigraphic unit (as Un,
where n is a number from 1 to 4). His four classifications are: andesite in
pillow lavas and chlorite amphibolites for the main basalt unit (U1);
para-amphibolites, meta-rhyolites, with greywacke, feldspathic quartzite to
amphiboles for the intermediate unit (U2); the greenstone and many pillow
lavas for the upper unit (U3); and itabirite (U4). In addition to his four
stratigraphic  units,  he  also  revealed  the  existence  of  geological  formations 
of  regional  importance  such  as  the  granitoids  and  the  series  of  schisto- 
quartzitic  rocks. 


The  Geologic  Data 

We studied a variety of rock types: plutonic, sedimentary, and
metamorphic rocks, as well as rock deformations. In general, the extent and density
of the local vegetation made it very difficult to search for significant
outcroppings (Djebebe-Ndjiguim 2013). We will consider the variety of
rocks type by type.

Plutonic  rocks 

A pluton  is  a body  of  intrusive  igneous  rock  (called  plutonic  rock) 
that  is  crystallized  from  magma  slowly  cooling  below  the  surface  of  the 
Earth.  In  this  category  we  studied  two  types:  quartz  veins  and  Granitoids. 

Quartz  vein  (lode):  There  are  two  types  of  quartz  veins  in  the  study  sector: 
metamorphic  formations;  and  rock  crystal  veins  located  in  the  sandstone. 
Quartz  veins  are  not  barren  of  mineralized  rock  crystals.  And  so  in  these 
veins  we  noted  the  presence  of  some  minerals,  such  as  emerald  and  gold, 
due to the movement of warm waters (Cornacchia et al. 1985).

In  the  greenstone  belt  toward  the  vein  wall  there  are  altered 
Amphibolites  in  the  chlorite-schist.  According  to  Cornacchia  et  al.  (1985), 
quartz  veins  containing  rock  crystals  are  found  in  the  sandstone.  These  veins 
continue  through  to  the  quartz  veins  found  in  the  metamorphic  rocks.  They 
originate  in  the  emanation  from  granite  and  are  mineralized  rock  crystal. 
The  veins  occur  from  30°  N to  the  south  with  a thickness  of  15  centimeters 
to 5 meters and orient North 90° to 105°. They include some geodes in
which  beautiful  quartz  crystals  have  developed.  The  crystals  have  a 
thickness  of  at  least  30  millimeters.  They  can  reach  1.5  m thick  in  some 
geodes.  Across  the  rock  outcroppings  in  the  direction  N 30°  they  form  solid 
blocks of milky-white appearance and are heavily fractured.

During  our  field  observations  we  spotted  four  levels  of 
implementation  veins  in  the  quartz  downstream  of  the  dam  at  the  Mbi,  and 
even  more  implementation  veins  next  to  the  road  to  Bossemmbele.  These 
are  the  extension  of  those  downstream  of  the  dam.  The  seams  are  flush  to 
both  sides  of  the  hill  overlooking  the  dam.  Some  veins  fold  into  a semicircle 
under  the  mast  of  the  town’s  police  station  and  also  in  the  stone  quarry. 

Granitoids: Granitoids are plutonic rocks that are poor in silicon dioxide
(SiO2). They are designated in the upper part of the table of the international
classification of Streckeisen. In our region of Boali there are diorites,
granodiorites,  granites  of  Mbi,  and  granites  of  Bolen.  We  observed  that 
granitoid  outcroppings  in  the  region  cover  a very  large  area.  Although  grey 
in  appearance  these  rocks  sometimes  have  alternating  beds  of  dark 
ferromagnesian  amphibole  and  biotite  and  clear  beds  (quartz,  feldspar,  and 
muscovite). 

The  granitoids  of  Mbi  orient  30°  N dipping  70°  W.  They  are 
traversed  by  quartz  aplite  and  pegmatite  veins.  These  formations  are 
subdivided  into  granite,  matching  granite,  and  orthogneiss. 

Granite  of  Mbi  - Granite  is  a fully  crystalline  rock.  Minerals  are  on  average 
2 to 5 mm in size, about the size of a grain of wheat (granite comes from the
Latin granum = grain). They contain three essential minerals: quartz, alkali
feldspar  (orthoclase  and  microcline),  and  plagioclase  combined  with  mica 
(biotite  and  muscovite).  The  quartz  comes  in  a grayish  color  surrounding 
other  crystals.  Its  appearance  is  that  of  salt  but  with  a bold  loamy  appearance 
as  if  it  burst  out  of  the  rock.  In  the  region  of  Mbi  the  quartz  has  a conchoidal 
fracture.  The  alkali  feldspars  have  variable  colors  (white,  pink,  red)  and  are 
Carlsbad-twinned (the crystal is alternately brilliant and dull). Biotite occurs in
black  strips  some  with  a golden  luster,  with  cleavages  or  cleavage  lines. 
This  intrusive  massif  has  lagged  behind  the  plate  tectonics.  It  is  late  granite, 
very  marked,  and  located  on  the  left  bank  of  Mbali  in  our  study  area.  This 
massif  has  a grainy  central  facies  with  large  elements  of  alkaline  feldspar, 
very  rich  in  feldspar,  and  with  very  fine  grain  borders.  It  manifests  itself  in 
the landscape by significant outcroppings and can be observed in the stone
quarry  without  major  difficulty. 

Granodiorites:  Granodiorites  have  a constitution  nearly  that  of  granite; 
their silica content can be as high as that of granite, but they contain more
plagioclase feldspar than orthoclase feldspar. Common rock “granite” can
be  distinguished  from  granodiorites  by  carefully  considering  their  feldspar. 
Granodiorites  of  the  Boali  region  have  micro-fractures  that  allow  the 
circulation  of  fluids.  There  is  a possibility  of  finding  gold  and  pyrite.  The 
presence  of  the  epidote  gives  the  rock  its  green  color.  This  epidotisation  is 
due  to  the  alteration  of  the  potassium  in  the  feldspar.  The  outcropping  is 
multi-kilometric, with an intrusion of the dolerite and granites. They are
dated to 2.1 Ga (Poidevin and Pin 1986).

Diorites:  Diorite  is  an  intrusive  igneous  grainy  rock  with  a silica  deficiency 
(less  than  20%);  therefore,  it  does  not  contain  free  quartz.  It  is  principally 
composed  of  the  minerals  plagioclase  feldspar  (typically  andesine),  biotite, 
hornblende,  and/or  pyroxene.  Feldspar,  generally  grayish,  helps  to  give  the 
rock  a dark  color.  Diorites  are  intruding  amphibolites  and  are  contiguous 
with  the  granodiorites  of  Mbi. 

Dolerite:  Dolerites  are  intermediate  rocks  that  fall  between  grainy  gabbros 
and  basalts  with  microlitic  grain  that  is  visible  under  a microscope  and 
shows  sub-hedral  plagioclase  laths  molded  by  interstitial  pyroxene.  They 
are  generally  massive  and  compact  with  a color  ranging  from  black  to  grey 
but  more  often  dark  green.  We  saw  three  hills  of  dolerite  in  the  intruding 
granodiorites  in  a nearby  outcropping  of  granite. 

Contact  areas:  The  Bangui-Boali  section  shows  several  contact  areas 
characterized  by  vein  crates  between  sedimentary  rocks  and  metamorphic 
rocks. About 100 km from the first dam to the north of Boali we find a
contact  between  Amphibolites  and  sandstones.  The  contact  is  characterized 
by  a puckered  quartz  hill.  At  123  km  another  contact  is  characterized  by  a 
type  of  vein  that  is  favored  by  the  hydrothermalism  between  granitoids  and 
massive  Amphibolites;  there  is  another  contact  with  upright  schistosity 
(sub-vertical  N 60°). 
Sedimentary rocks

Sedimentary  rocks  are  rocks  that  are  formed  by  the  deposition  of 
material  at  the  Earth’s  surface  and  within  bodies  of  water.  Sedimentation  is 
the  collective  name  for  processes  that  cause  mineral  and/or  organic  particles 
(detritus)  to  settle  and  accumulate  or  minerals  to  precipitate  from  a solution. 
Sandstone  is  a sedimentary  rock.  It  is  a consolidated  rock  that  belongs  to 
the  class  of  arenite  rocks  that  have  a grain  size  between  0.0625  and  1 mm. 

Thus  we  can  distinguish  between  quartz  sandstones,  where  a 
microcrystalline  material  persists  between  the  grains  of  quartz,  and  quartzite 
sandstone,  where  grains  are  linked  to  each  other  following  a secondary 
pathway  that  depends  on  the  cement.  They  are  located  south  of  the 
Kassango  area  and  belong  to  the  Oolitic  sandstone  of  the  Boali  series  (see 
Photo  1C  which  shows  the  sandstones  of  Boali  falls).  The  corresponding 
features  are  homogeneous  fine-grained  quartzose  sandstones  (with  clay 
cement  in  the  south-west  that  changes  to  siliceous  cement  to  the  south  and 
south-east).  Quartzite  occurs  on  the  road  to  the  city  in  the  stratigraphic 
extension  falls  downstream  from  the  third  Boali  falls.  The  sandstones  are 
grayish  and  friable  rocks  whose  diamante  detrital  minerals  are  amorphous 
quartz  grains  often  recrystallized  as  anhedral  feldspar.  Observation  with  a 
microscope  reveals  rare  biotite  lamellae  and  a few  fine  flakes  of  muscovite. 
Boali sandstones are the equivalent of those of Fatima, a district located in
Bangui, the capital of the CAR.

Metamorphic  rocks 

Metamorphic  rocks  arise  from  the  transformation  of  existing  rocks, 
in  a process  called  metamorphism,  which  means  “change  in  form”.  We 
found  four  types  of  metamorphic  rocks  in  our  study  area:  Schist, 
Amphibolites,  Itabirite,  and  Gneiss. 

Schist:  Schists  are  characterized  by  medium  to  large,  flat,  sheet-like 
grains  with  a preferred  orientation.  The  outcroppings  form  in  slabs  on  the 
bed  of  the  Kassango  at  the  roadside  and  are  often  interstratified  with  the 
sandstones.  Of  greenish  hue  this  rock  has  a sub-vacuolar  structure 
throughout.  It  presents  numerous  vacuoles  and  therefore  it  is  strongly 
schistose.  It  fits  into  beds  that  are  clear  of  recrystallized  quartz  and  chlorite 
and  has  dark  beds  of  rare  sericite  altered  biotite.  We  find  these  in  the  region 
of  Boali,  and  we  find  this  same  shale  on  the  road  to  Damara.  These  are  rich 
in mica and nodules and are very crumpled; this is the schist of Boali, the
equivalent  of  the  Fatima  shale  that  belongs  to  the  series  of  Bangui,  which 
are  above  the  Yangana  shale. 


Photo 1: Sandstones of Boali falls

Amphibolites:  Amphibolites  are  dark  green  rocks  consisting  mainly  of 
amphibole  crystals  more  or  less  ordered  along  the  planes  of  schistosity.  We 
can  distinguish  laterized  amphibolites,  layered  amphibolites,  and  massive 
amphibolites. 

Highly altered and chloritized laterized amphibolites are found in
the area of the lakes of the crocodiles, in a stone quarry about 100 km away
from the laterites on the highway. Layered amphibolites have a banded
texture characterized by alternating feldspathic quartz beds and detachment
beds. The hyper-fractured veins come within 1 km of the village of Bogoin
(Bobissa village). Massive amphibolites are mottled, massive granoblastic
rocks. These fine-grained rocks show a discontinuity in their arrangement.
Dark minerals are dominant, with cleavage visible in the amphibole and
biotite. There are also some rare flakes of muscovite. The massive
amphibolites stretch from the village of Bogoin to where they make contact
with the granodiorites.

Itabirite: The itabirites are banded ferruginous quartzites. They crop out
in a band one kilometer wide. These are generally quartz-rich rocks occurring
with magnetite and often oligiste (hematite). This last mineral concentrates
in massive structures around the quartz veins crossing the Banded Iron
Formation (BIF). Where the banding is very marked, dark magnetite beds
alternate with light-colored beds of quartzo-feldspathic rock.

Gneiss: The gneisses are medium- to coarse-grained rocks with grains about
1 mm to 1 cm in size. They often have a distinct foliation in which generally
dark beds rich in ferromagnesian minerals (mica, amphibole) alternate with
light-colored (white, grey, pink) beds of quartz and feldspar visible to the
naked eye. We noted the presence of orthogneisses, which form a contact
between amphibolites and granodiorites on one side and between the granite
and granodiorites on the other side.

Other  metamorphic  rocks 

Quartzites and muscovite-bearing rocks occupy the eastern part of the
region. Shale appears in slabs on the bed of the Ngalou. Chlorite schist
and schist outcroppings occur in the region of Bogoin. Orthogneisses
occupy the southern part of the region of Bogoin. The southern region of
the Kolango is characterized by a lower relief with very gentle slopes in its
uppermost part. On the sides of the massive rocky benches, blocks, and
fractured outcroppings we can distinguish massive metabasalts in the
pillow lavas and metabasalts in the stringers of intruding quartz.

Deformations

Deformation takes an object from its initial state to its final state by
mass transport (translation, displacement, and rotation) and by internal
deformation. The deformed object is defined by its dimensions.

Stratification  is  one  form  of  deformation.  Bedding  planes  illustrate  the  style 
of  the  planar  structural  element.  These  were  initially  roughly  flat,  horizontal 
surfaces. Their characteristics and variations are an imprint of the
deformations that have been imposed on the sedimentary terrain since
deposition. This stratification is observed in the sandstone outcropping. At
the entrance to the falls of Boali we observed the stratification crossing the
sandstone (see Photo 1E).

Geological foliation (metamorphic arrangement in layers) with
medium to large grained flakes in a preferred sheetlike or planar orientation
is called schistosity. The plane of the schistosity is called S. In formations
containing more competent levels, stretching leads to socking, that is, to
the segmentation of the most competent object into fragments and socks.
Photo 1D illustrates deformations characterized by boudinage, folds, faults,
and shears.

• Boudinage  is  a term  used  in  geology  to  indicate  structures  formed 
by  extension  (where  a rigid  body  is  deformed  often  into  a sausage 
or  boudin  like  shape). 

• A fold  is  a permanent  waveform  deformation  in  layered  rock  (the 
rocks  bend  or  twist).  It  occurs  when  one  or  a stack  of  originally  flat 
surfaces  (such  as  sedimentary  rock)  are  permanently  bent  or  curved. 

• A fault is a fracture in the bedrock, a break accompanied by the
relative movement of two components. The movement can be
vertical or oblique (normal or reverse fault) or horizontal
(strike-slip or shear).

• A shear  is  the  response  of  a rock  to  deformation  usually  by 
compression.  The  shear  can  be  emphasized  by  certain  minerals. 

In the schist we essentially observed shears and lineaments. The shears
are break planes accompanied by the relative movement of two
components, which show the dip of the faults. Lineaments are mineral
lineations that occur during metamorphic crystallization.

In our study area there is a boudin fold overturned to the NW. It
consists essentially of anhedral quartz crystals a centimeter in size. It is
molded into the clay schist (shale) and found at the roadside in Kassango.
Quartz flanges are located in the clay schist of Kassango, at the level of the
boudin folds. On the road to the town’s police station we find an overturned
fold in sandstone. The itabirites are also strongly folded; the folds are
crooked, with a very upright fold axis. Under the mast of the police station
is a concentric fold with a diameter of 40 cm. The fold shown in Photo 1D is
observed in the clay schist (shale) of Kassango. Finally, we see a
deformation characterized by a fold slumping downward. Accompanying the
schistosity and the boudin is a tangential tectonic surface directed from
S-SE toward N-NW. A second deformation is a tangential tectonic surface
contrary to the first; it runs NW-SE. This tectonic surface is confined by
the conical megafold running N-NW. A tectonic surface relates to the
structure of the Earth’s crust and the large-scale processes which take place
within it.

Shears: Sinistral and dextral shears were observed at the stone quarry (S2,
see Figure 2). They form a corridor of sinistral shear 5 m wide for the S2
shears and fall 155-45° SW. The basal formation of the stone quarry shows
deformation bands approximately 60 m wide. We found that the dextral
shear (S2, Figure 2) was hardly observable. On the other hand, the S2
shears are very representative of the class.


Figure 2: The center insert shows the dextral shears in the area. The left
and right sides of the figure show the sinistral shear, S2.

Lineaments: The lineaments are mineral lineations formed during
metamorphic crystallization. We observed lineaments in the sandstone
towards the falls. The lineament runs N 120-35° S. It crosses all fractures.
We observed in the dolerites two families of lineaments: one in direction
N 135°, sub-vertical; and the other, south-facing, in direction N 25° with a
dip of 60°. There are small intercalations of gneiss in the outcroppings of
dolerite. The thickness of these dolerites can reach 40 m. Diabase dykes
continue to the top of the hill. All these formations in the sector are affected
by brittle deformation, which appears here as faults. Faults are fractures in
the bedrock, breaks accompanied by the relative movement of two
components. The movement can be vertical or oblique (normal or reverse
fault) or horizontal (strike-slip or shear). The fault of Boali is a normal fault
(see Photo 1E, F), corresponding to Figure 3, which shows a normal fault.
These faults have been found in the sandstone in front of the police station
(in the main city). They include three series of fracturing: F1 and F2, in
direction N 0 ± 10°, with a dip of 60-75° E to sub-vertical; and F3, in
direction N 145 ± 10°, sub-vertical. See Table 1 (in front of the Internet
service provider for Boali, in the main city).

In addition, the brittle tectonics of the Mbi sector highlights four
major series of faults, which run N 45° - N 50°; N 80° - N 100°;
N 130° - N 140°, corresponding to F3; and N 160° - N 175° (see Table 1,
Mbi sector).
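
The four strike ranges listed for the Mbi sector lend themselves to a simple lookup. The Python sketch below is illustrative only: the names `MBI_FAULT_FAMILIES` and `fault_family` are hypothetical, and reducing a strike modulo 180° (so that N 225° is treated like N 45°) is an assumption of this sketch rather than something stated in the text.

```python
# Fault families of the Mbi sector as strike ranges, in degrees east of north.
# The numerical ranges are those quoted in the text; the names are hypothetical.
MBI_FAULT_FAMILIES = [
    (45.0, 50.0),
    (80.0, 100.0),
    (130.0, 140.0),  # the family the text relates to F3
    (160.0, 175.0),
]


def fault_family(strike_deg, families=MBI_FAULT_FAMILIES):
    """Return the (low, high) strike range containing strike_deg, or None.

    Assumption: a strike is an undirected line, so the azimuth is reduced
    modulo 180 degrees before the lookup.
    """
    azimuth = strike_deg % 180.0
    for low, high in families:
        if low <= azimuth <= high:
            return (low, high)
    return None


if __name__ == "__main__":
    for measured in (47.0, 135.0, 225.0, 10.0):
        print(measured, "->", fault_family(measured))
```

A measurement that falls outside every range returns `None`, which keeps unclassified strikes visible instead of forcing them into the nearest family.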


Fig. 3: The dextral shear and normal fault observed in the study areas

Locality                     Outcropping   Fault              Direction and dip
In front of the Internet     sandstone     faults F1 and F2   N 0 - 75 E
service provider for Boali                                    N 10, sub-vertical
                                                              N 145
Sector of Mbi                              fault              N45-N50
                                                              N80-N100
                                                              N130-N140
                                                              N160-N175

Table 1: Fault families observed in the study areas.


Photo  2:  The  basic  faults  and  veins 


Timeline for the Faults: Photo 2 shows the various fault lines we observed.
First, F1 was established with a filling. Then a second fault, F2, parallel to
F1 and included in F1, was established with a new filling of a quartz vein. A
third fault, F3 (also found in the area of Mbi), is oblique to F1 and F2, runs
in the direction N 145, sub-vertical, and joins F1 and F2, accompanied by its
vein filling. Finally, a mineral lineament of direction and dip N 120 - 35 S
(see Photo 2) has complex features that indicate it is recent.

Structurally, the D1 and D2 deformations, which have contrary motion,
show two tectonic movements: tectonics of Eburnean age (2.1 Ga),
responsible for thrusting from SE to NW, and a second tectonics of
Pan-African age. We therefore suggest dating the metamorphic and
sedimentary rocks to establish the chronology of events, and an elemental
geochemistry study to plot a Concordia diagram giving both the age of
formation and the age of metamorphism.

On the regional level, there is a corridor of grid cut faults towards
the north of the CAR (Cornacchia and Dars, 1983) in the directions N 70°
and N 40°. This is the area of strike-slip faults of the Oubanguides.

Other setback faults with a direction N 130° to N 160° towards the
sinistral fault affect all the structural units (Poidevin, 1991). These major
setbacks date from the Pan-African phase.

As we get closer to our study area we find different faults in the
major setback of the Pan-African phase described by Rolin (1992). There
are two families of faults (N 45° and N 80°) that correspond to the dextral
grid N 70° and N 40° of the Pan-African phase. There are faults running
N 130° - N 140° corresponding to the sinistral transcurrent N 130° of the
Pan-African phase. Similarly, faults running N 160° - N 175° can be
classified within the family of sinistral offsets at N 160° of the Pan-African
phase. We found two families of shear faults, the first sinistral (N 130°)
and the second dextral (N 45°), as we have described. These two faults
affect both formations of Mbi.


Conclusion 

The  CAR  is  a landlocked  country  in  Central  Africa.  It  is  divided  into 
16  administrative  prefectures,  two  of  which  are  economic  prefectures,  and 
one  an  autonomous  commune;  the  prefectures  are  further  divided  into  71 
sub-prefectures. Geologically, the CAR is a country rich in mineral
resources. Our study area is located in the region of Boali, a town in the
prefecture of Ombella M’poko. Boali is on National Road 1 (RN1) about
100 km northwest of Bangui, the capital of the CAR.

For this work we followed the protocols set by previous studies of
the geologic formations in the Boali region. We also considered studies of
the region by others. We include not only new information on the geologic
formations of the Boali region but also discuss the effect that
non-exploitation of the granitoids in the area has on the region. The Boali
region has an important reserve of granitoids, which form outcroppings and
constitute the basement of the region. Geologically, granitoids consist
essentially of quartz and feldspar, with accessory ferromagnesian minerals.
In the region of Boali the crystals form in veins which intrude into the
sedimentary and metamorphic formations. These rocks are important and
useful for economic and social development.

In  the  region  of  Boali  most  mining  is  done  by  artisanal  gold  miners. 
Granitoids  have  never  interested  the  people  in  the  Boali  region.  We  note  that 
in  the  region  of  Boali,  none  of  the  local  residences  are  made  with  material 
from the granitoids. Yet granitoids are needed for infrastructure. In
general for the CAR, and in particular for the Boali region, granitoids are
an ignored and abandoned source of wealth. If the granitoids of the Boali
region were exploited, they could contribute to building construction; for
example, tiles can be made of granitoid. Products made with granitoids are
among the hardest of materials and offer a luxury alternative to marble tiles.

Petrographically, the sedimentary rocks are quite fractured and
mingle with quartz veins and with metamorphic and magmatic rocks. The
magmatic differentiation that led to the emplacement of the diorites,
granodiorites, and granites, as well as the intrusion of intermediate rocks
(dolerite), shows a bimodal magmatism confirmed by the presence of
granitoids and ultrabasites.

The quartz from quartz veins can be made into glass for the
manufacture of laboratory equipment such as burettes, beakers, and test
tubes, which are urgently needed in the Central African Republic. Quartz
veins are an asset for developing jewelry workshops. Quartz from quartz
veins is also used in the manufacture of silicon wafers, integrated circuits
for audio and video devices, microprocessors for computers, solar panels,
and electric watches. It is also used in gas and electronic lighters.
Furthermore, it can be used in construction, for coating houses, pavements,
and layering of load-bearing seats.

Unfortunately for the CAR in general and for the region in particular,
these rocks have not been exploited. We do note, however, that it is a
complex issue. Consequently, the granitoids have not contributed to social
development in the CAR and Boali. In Boali, unemployed youth might find
productive work in artisanal mining of these reserves. The government
might consider implementing a policy for small and medium business
development in the areas mentioned above that might lower the high
percentage of unemployment with all its consequences, not least of which
are prostitution and violence. From our investigation in the region of Boali,
we conclude that currently the most important human activities are
individual artisanal agricultural production, fishing, and hunting to satisfy
daily needs.


Abstract

On  1977  August  15,  the  Ohio  State  University  Radio  Observatory 
detected  a strong  narrowband  signal  northwest  of  the  globular  star  cluster 
M55  in  the  constellation  Sagittarius  (Sgr).  The  frequency  of  the  signal, 
which closely matched the hydrogen line (1420.40575177 MHz), peaked
at approximately 23:16:01 EDT. Since then, several investigations into
the “Wow” signal have ruled out a terrestrial origin and other sources
such as satellites, planets, and asteroids. From 1977 July 27
to 1977 August 15, comets 266P/Christensen and P/2008 Y2 (Gibbs)
were transiting in the neighborhood of the Chi Sagittarii star group.
Ephemerides for both comets during this orbital period placed them in
the vicinity of the “Wow” signal. Surrounding every active comet, such
as 266P/Christensen and P/2008 Y2 (Gibbs), is a large hydrogen cloud
with a radius of several million kilometers around its nucleus. These
two comets were not detected until after 2006; therefore, the comets
and/or their hydrogen clouds were not accounted for during the “Wow”
signal  emission.  Because  the  frequency  for  the  “Wow”  signal  fell  close 
to  the  hydrogen  line,  and  the  hydrogen  clouds  of  266P/Christensen  and 
P/2008  Y2  (Gibbs)  were  in  the  proximity  of  the  right  ascension  and 
declination  values  of  the  “Wow”  signal,  the  comet(s)  and/or  their 
hydrogen  clouds  are  strong  candidates  for  the  source  of  the  1977  “Wow” 
signal. 


Introduction 

On 1977 August 15 at approximately 23:16:01 EDT, the Big Ear Radio
Telescope at The Ohio State University detected an intermittent narrowband
radio signal (<10 kHz) northwest of the globular star cluster M55 in the
constellation of Sagittarius (Sgr) [1][2] and approximately 2.5° south of the
Chi Sagittarii star group [5]. Determining the exact location in the sky where
the 72-second signal originated was problematic because the telescope used
two separate feed horns to search for radio signals [5]. The data from the
signal, moreover, were processed in such a way that it was difficult to
establish which of the two horns detected the signal [2]. There are,
therefore, two possible right ascension values for the source of the alleged
extraterrestrial intelligence signal: 19h22m24.64s ± 10s and
19h25m17.01s ± 10s. The declination was determined to be -27°03' ± 20'
(Table 1) [2]. Two similar values for the signal’s frequency were assigned:
1420.356 MHz and 1420.4556 MHz. These two frequencies fall close to the
hydrogen line, which is 1420.40575177 MHz [6].
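
How close the two assigned frequencies lie to the hydrogen line can be checked with a few lines of arithmetic. The Python sketch below is illustrative and not part of the original analysis: the function names are hypothetical, the Doppler formula is the non-relativistic approximation, and the sign convention (positive velocity means a receding source) is an assumption of the sketch.

```python
# Rest frequency of the hydrogen line as quoted in the text, in MHz.
HYDROGEN_LINE_MHZ = 1420.40575177
# Speed of light in km/s (exact SI value).
SPEED_OF_LIGHT_KM_S = 299792.458


def offset_khz(observed_mhz, rest_mhz=HYDROGEN_LINE_MHZ):
    """Observed frequency minus rest frequency, in kHz."""
    return (observed_mhz - rest_mhz) * 1000.0


def equivalent_radial_velocity_km_s(observed_mhz, rest_mhz=HYDROGEN_LINE_MHZ):
    """Line-of-sight velocity that would shift rest_mhz to observed_mhz.

    Non-relativistic approximation: v = c * (f_rest - f_obs) / f_rest.
    Assumed convention: positive means the source recedes from the observer.
    """
    return SPEED_OF_LIGHT_KM_S * (rest_mhz - observed_mhz) / rest_mhz


if __name__ == "__main__":
    for f in (1420.356, 1420.4556):
        print(f"{f} MHz: offset {offset_khz(f):+.2f} kHz, "
              f"equivalent velocity {equivalent_radial_velocity_km_s(f):+.2f} km/s")
```

Both assigned frequencies lie roughly 50 kHz from the rest frequency, one below it and one above it.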

Table 1: Right Ascension and Declination Equinox
Conversions; and Galactic Coordinates for the “Wow” Signal
(Source: Ohio State University Big Horn Report)

                     Declination      Positive Horn         Negative Horn
B1950.0 Equinox      -27°03' ± 20'    19h22m24.64s ± 10s    19h25m17.01s ± 10s
J2000.0 Equinox      -26°57' ± 20'    19h25m31s ± 10s       19h28m22s ± 10s
Galactic Latitude    N/A              -18d53.4m ± 2.1m      -19d28.8m ± 2.1m
Galactic Longitude   N/A              11d39.0m ± 0.9m       11d54.0m ± 0.9m

Previous  Investigations  by  the  Astronomical  Community 

Subsequent attempts to re-detect and identify the “Wow” signal by
The Ohio State University, the Very Large Array, and The University of
Tasmania’s Mount Pleasant Radio Observatory yielded null results. After a
search of the area where the “Wow” signal was detected (Table 2), the Very
Large Array and The Ohio State University Radio Observatory concluded
there was strong evidence against a terrestrial origin and against sources
such as planets, man-made spacecraft, artificial satellites, and radio
transmissions emanating from Earth. Furthermore, the Very Large
Array proposed the intermittent “Wow” signal matched the signature of a
transiting celestial source [5], while The University of Tasmania suggested
the signal was moving with the source of the hydrogen line [7].

Anatomy  of  a Comet  and  Its  Hydrogen  Cloud 

The distinctive parts of a comet include the nucleus, coma, dust tail,
ion tail, and hydrogen cloud. Moderately active comets are surrounded by
a widespread cloud of neutral hydrogen atoms [4]. The hydrogen is released
when ultraviolet radiation from the Sun splits water vapor molecules
released from the nucleus of the comet into their constituent components,
oxygen and hydrogen [8]. The size of the hydrogen cloud is determined by
the size of the comet and can extend over 100 million km in width, as with
the hydrogen cloud of comet Hale-Bopp [9]. As a comet approaches the Sun,
its hydrogen cloud increases significantly. Since the rate of hydrogen
production from the comet’s nucleus and coma has been calculated at
5 × 10^29 atoms of hydrogen every second, the hydrogen cloud is the largest
part of the comet [9]. Moreover, due to two closely spaced energy levels in
the ground state of the hydrogen atom, the neutral hydrogen cloud
enveloping the comet will release photons and emit electromagnetic
radiation at the frequency of the hydrogen line (1420.40575177 MHz)
[10].
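
As a brief worked illustration (standard physics, not a calculation taken from this paper), the quoted line frequency corresponds to the following wavelength and photon energy, using the SI values of the speed of light and the Planck constant:

```latex
\begin{align*}
\nu &= 1420.40575177\ \text{MHz} \\
\lambda &= \frac{c}{\nu}
  = \frac{2.99792458\times10^{8}\ \text{m/s}}{1.42040575177\times10^{9}\ \text{s}^{-1}}
  \approx 0.2111\ \text{m} \approx 21.1\ \text{cm} \\
E &= h\nu
  = (6.62607015\times10^{-34}\ \text{J\,s})(1.42040575177\times10^{9}\ \text{s}^{-1})
  \approx 9.41\times10^{-25}\ \text{J} \approx 5.87\ \mu\text{eV}
\end{align*}
```

The very small photon energy reflects how closely spaced the two ground-state levels are.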


                Date of Search    RA                          DEC
VLA             25 SEP 1995       19h21m28.1s to 19h25m48s    -27°41 to -26°18
                07 MAY 1996       19h21m28.1s to 19h25m48s    -27°41 to -26°18
Ohio State U.   05 OCT 1998       19h22m22s                   -27°03
                09 OCT 1998       19h25m12s                   -27°03
                9-10 APR 1999     19h25m12s                   -26°48
                17-18 MAR 1999    19h22m22s                   -27°18
                20-21 MAR 1999    19h25m12s                   -27°18
                22-23 MAR 1999    19h22m22s                   -26°48

Table 2: Right Ascension and Declination Observations Grid by the VLA and Ohio State
University (Source: VLA and Ohio State)


Comets  266P/Christensen  and  P/2008  Y2  (Gibbs) 

From 1977 July 27 to 1977 August 15, Jupiter-family comets
266P/Christensen and P/2008 Y2 (Gibbs) were transiting in the vicinity of
the Chi Sagittarii star group and significantly close to the source of the
“Wow” signal (Figure 1) [3][11]. Of significance to this investigation, the
purported source of the “Wow” signal was fixed between the right ascension
and declination values (Table 3) of comets 266P/Christensen and P/2008
Y2 (Gibbs). On their orbital planes, moreover, 266P/Christensen was 3.8055
AU from Earth and moving at a radial velocity of +13.379 km/s, and P/2008
Y2 (Gibbs) was 4.406 AU from Earth and moving at a radial velocity of
+19.641 km/s (Figure 2) [3].

Figure 1: Location of Comets 266P and P/2008 from 1977 July 27 to 1977 August 15.
(Source: The Minor Planet Center and NASA JPL Small Body Database) [11].

Table 3: Right Ascension and Declination Values for Comets P/2008 Y2
(Gibbs) and 266P/Christensen (Source: Minor Planet Center)

                     Date           RA                 DEC
P/2008 Y2 (Gibbs)    27 JUL 1977    19h28m12s ± 10s    -27°31
                     01 AUG 1977    19h25m17s ± 10s    -27°33
                     05 AUG 1977    19h22m23s ± 10s    -27°35
                     15 AUG 1977    19h16m37s ± 10s    -27°36
266P/Christensen     07 AUG 1977    19h29m47s ± 10s    -25°53
                     15 AUG 1977    19h25m17s ± 10s    -25°58

Figure 2: On 1977 August 15, comet 266P/Christensen was 3.8055 AU from Earth and
comet P/2008 Y2 (Gibbs) was 4.406 AU from Earth (Source: JPL Solar System Dynamics
Database) [12]
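
The positions in Tables 1 and 3 can be compared directly by computing angular separations on the sky. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the quoted uncertainties are ignored, and the comet positions are assumed to be in the same equinox as the B1950.0 row of Table 1, which the source does not state.

```python
import math


def hms_to_deg(hours, minutes, seconds):
    """Right ascension in hours/minutes/seconds to degrees (1 h = 15 deg)."""
    return 15.0 * (hours + minutes / 60.0 + seconds / 3600.0)


def dm_to_deg(sign, degrees, arcminutes):
    """Declination from a sign (+1 or -1), degrees, and arcminutes."""
    return sign * (degrees + arcminutes / 60.0)


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation of two sky positions, all angles in degrees.

    Uses the haversine form, which stays accurate for small separations.
    """
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(math.sqrt(a)))


# Signal positions from Table 1 (B1950.0 row); comet positions for
# 15 AUG 1977 from Table 3 (equinox assumed to match, see the note above).
SIGNAL = {
    "positive horn": (hms_to_deg(19, 22, 24.64), dm_to_deg(-1, 27, 3)),
    "negative horn": (hms_to_deg(19, 25, 17.01), dm_to_deg(-1, 27, 3)),
}
COMETS = {
    "266P/Christensen": (hms_to_deg(19, 25, 17), dm_to_deg(-1, 25, 58)),
    "P/2008 Y2 (Gibbs)": (hms_to_deg(19, 16, 37), dm_to_deg(-1, 27, 36)),
}

if __name__ == "__main__":
    for horn, (s_ra, s_dec) in SIGNAL.items():
        for name, (c_ra, c_dec) in COMETS.items():
            sep = angular_separation_deg(s_ra, s_dec, c_ra, c_dec)
            print(f"{horn} to {name}: {sep:.2f} deg")
```

Because the negative-horn right ascension and the tabulated 266P/Christensen right ascension for 15 August agree to within a fraction of a second, that pair is separated almost entirely in declination, by about 65 arcminutes.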


The data regarding comets 266P/Christensen and P/2008 Y2
(Gibbs), therefore, strongly suggest either comet, or both, could be the
source of the hydrogen line signal detected by The Ohio State University on
1977 August 15. Chemicals in comets emit radio waves. The hydrogen radio
waves from a comet, such as 266P/Christensen or P/2008 Y2 (Gibbs),
travel through space like light. Therefore, radio telescopes, including the
Big Ear Radio Telescope at The Ohio State University, could have
intercepted them. It is noteworthy, moreover, that during observations of
the area by the Very Large Array and The Ohio State University Radio
Observatory (from 1995 to 1999), comets 266P/Christensen and P/2008 Y2
(Gibbs) were not in the neighborhood of the right ascension and declination
values of the “Wow” signal (Table 4) [5]; thus the hydrogen clouds of these
two comets would not have been detected. Additionally, because the period
of comet 266P/Christensen is 6.63 years and that of P/2008 Y2 (Gibbs) is
6.8 years [3], their orbital periods could account for why the “Wow” signal
was intermittent and not detected during subsequent searches of the area.

Conclusions

There are noteworthy data to support the proposal that the hydrogen
signal detected by the Big Ear Radio Telescope at The Ohio State
University, specifically at 1420.356 MHz and 1420.4556 MHz, emanated
from the neutral hydrogen clouds of comets 266P/Christensen and/or
P/2008 Y2 (Gibbs). There are, conversely, many unknowns the
astronomical community will need to address to confirm that the hydrogen
clouds from these comets were the source of the 1977 “Wow” signal. To
date, no observations have measured the size, mass, and, most critically,
spectral signature of these two comets. Additionally, in 1977 the Big Ear
Radio Telescope was operating in drift scan mode. Consequently, if a comet
(or any celestial object) was the source of the “Wow” signal, it should have
been detected in the trailing beam after detection in the leading beam [13].
Comet 266P/Christensen will transit the neighborhood of the “Wow” signal
again on 2017 January 25 and can be located at right ascension
19h25m15.00s and declination -24°50' at a magnitude of +23 [3]. On 2018
January 07, comet P/2008 Y2 (Gibbs) will also transit the neighborhood of
the “Wow” signal. Comet P/2008 Y2 (Gibbs) can be located at right
ascension 19h25m17.6s and declination -26°05' at a magnitude of +26.9 [3].
During this period, the astronomical community will have an opportunity to
direct radio telescopes toward this phenomenon, analyze the hydrogen
spectra of these two comets, and test the authors’ hypothesis.

Table 4: Location of Comets 266P/Christensen and P/2008 Y2 (Gibbs) During VLA and
Ohio State University Observations (Source: The Minor Planet Center)

                     Date                 RA        DEC
P/2008 Y2 (Gibbs)    25 SEP 1995 (VLA)    11h42m    +00°22'
                     07 MAY 1996 (VLA)    16h11m    -32°0T


Affine Geometry, Planck Length and Cosmic Acceleration

George  L.  Murphy 

Tallmadge,  OH 

Abstract 

In the 1940s Schrödinger developed a generalization of Einstein’s metric
gravitational theory based on a purely affine geometry. Today there are some
reasons to give this theory renewed attention. First, it is another step along the
path that Einstein pioneered in abandoning a priori assumptions about the
geometry of the world. Second, Schrödinger’s theory offers the prospect of
dealing  with  the  breakdown  of  the  metric  concept  at  the  Planck  scale  while 
retaining  the  continuum.  And  third,  the  requirement  that  the  cosmological 
constant  cannot  vanish  in  this  theory  means  that  the  cosmic  acceleration  which 
has  recently  been  discovered  can  be  included  in  a natural  way  with  this 
approach,  and  that  the  problem  of  a large  vacuum  energy  can  be  resolved. 

Introduction 

A scientific  theory  that  does  not  predict  novel  phenomena  or  correlate 
known  facts  better  than  its  competitors  will  be  relegated  to  the  history  of 
science  museum.  It  may,  however,  return  to  active  duty  if  it  helps  to  explain 
new data. Today Schrödinger’s affine field theory from the 1940s deserves
such reconsideration.1 It may help to explain cosmic acceleration and
provide  a basis  for  quantized  gravitation  as  a result  of  an  advance  beyond 
metric-based  general  relativity. 

This  theory  received  inadequate  attention,  or  the  wrong  type  of 
attention,  when  it  was  proposed.  Many  physicists  considered  it  to  be  only  a 
variant  of  a theory  Einstein  was  then  developing  in  which  a non-symmetric 
part  was  added  to  the  metric  tensor.2  Today  we  can  see  that  a theory  in  which 
metric  is  a secondary  concept  can  explore  territory  that  is  closed  to  a theory 
in  which  it  is  fundamental. 

When Schrödinger proposed his theory many relativists viewed the
cosmological term negatively. Pauli, for example, rejected the theory
because it required a non-vanishing cosmological constant.3 Now we know
that cosmic expansion is accelerating in a way that is compatible with a
cosmological term in the gravitational field equations.4

Attempts to extend general relativity like those of Schrödinger and
Einstein were also burdened by expectations that they could be unified field
theories encompassing all physical phenomena. Einstein’s hope that a
successful theory of this type would eliminate what he saw as unattractive
features of quantum mechanics gave many physicists the impression that
the whole line of work was essentially reactionary.5

Here  we  eschew  any  expectation  that  an  affine  theory  can,  by  itself, 
provide  a unified  explanation  of  physical  interactions.  While  it  does 
generalize  Einstein’s  theory  of  gravitation,  its  main  interest  is  that  it 
provides  a broader  geometric  framework  for  further  work.  We  will  begin  by 
considering  reasons  for  generalizing  the  geometry  that  Einstein  used  in  his 
1915  theory,  and  then  explore  the  possibilities  connected  with  quantum 
gravity  and  dark  energy. 

Is  Riemannian  Geometry  Necessary? 

The  name  “geometry”,  from  Greek  words  for  “earth”  and 
“measure”,  shows  the  discipline’s  origin  in  practical  concerns.  But 
geometry  also  became  the  theoretical  system  of  Euclid.  It  was  long  assumed 
that  this  system  described  the  world  correctly,  and  Kant’s  view  that  our 
minds  must  perceive  the  world  that  way  put  a sophisticated  seal  on  the  idea.6 
But  failures  to  prove  Euclid’s  parallel  postulate  finally  moved 
mathematicians  to  realize  that  it  could  be  replaced  by  another  assumption, 
and  thus  to  develop  non-Euclidean  geometries.7  Further  progress  resulted  in 
the  Riemannian  differential  geometry  that  Einstein  used  in  his  gravitational 
theory.  He  did  not  impose  a global  geometry  a priori  but  made  the  local 
character  of  space-time  something  to  be  determined  by  physical 
measurements.  Geometry  and  physics  were  united,  an  achievement  that 
Weyl  symbolized  in  the  equation  “Pythagoras  + Newton  = Einstein.”8 

General relativity uses a metric geometry: The local properties of
space-time are completely specified by the metric tensor $g_{\mu\nu}$. It is, however,
possible  to  consider  more  general  differential  geometries,  as  Weyl  did 
within  three  years  of  the  introduction  of  Einstein’s  theory  in  an  attempt  to 
include  electromagnetism.9  Other  generalizations  then  followed.10 

Attempts  to  extend  Einstein’s  unification  of  physics  and  geometry 
are viewed most helpfully in the spirit of Klein's Erlanger Programme.
Here different geometries are considered in terms of the transformation
groups that they allow. We can begin with a topological space whose
meaningful  properties  are  invariant  under  all  continuous  transformations, 
and  then  specialize  by  limiting  this  group,  allowing  new  properties  to 
emerge.  The  concepts  of  lines  and  points  as  their  intersections  are  invariant 
under  projective  transformations  and  are  therefore  meaningful  in  projective 
geometry.  A structure  of  parallelism  can  be  added  to  yield  affine  geometry, 
and  then  metric  properties  can  be  introduced. 

By  assuming  a metric  geometry  of  space-time  we  add  concepts  and 
specialize.  We  need  not  use  a geometry  more  complex  than  necessary  for 
physics,  but  insistence  that  the  geometry  of  the  world  must  be  Riemannian 
is  in  the  same  spirit  as  the  old  idea  that  the  geometry  of  the  world  must  be 
Euclid’s. 

These  arguments  will  appeal  to  those  who  believe  that  Einstein’s 
geometric  view  of  gravitation  was  fundamental.  Physicists  who  see  it  as  just 
one  way  of  interpreting  his  field  equations  will  question  the  value  of,  for 
example,  allowing  torsion,  a skew-symmetric  part  of  the  affine  connection. 
This  seems  to  be,  as  Weinberg  argues,  “just  a tensor,”  just  one  more  field.12 
But  in  generalizing  the  geometry  of  Einstein’s  gravitational  theory  we  are 
not  adding  torsion  but  removing  conditions  on  the  geometry  that  imply  the 
vanishing  of  torsion. 

The  fundamental  question  for  physics,  however,  is  whether  a 
geometric  approach  will  facilitate  our  understanding  of  our  observations  of 
the  world.  When  we  consider  that  possibility  we  will  see  some  of  the 
concrete  advantages  offered  by  an  affine  theory. 

Schrödinger's Affine Theory

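The basic assumption of this theory is a four-dimensional manifold
with an affine connection $\Gamma^{\lambda}_{\mu\nu}$ which is not assumed to be symmetric and
which gives the change in a vector $A^{\mu}$ when displaced by a coordinate
increment $dx^{\nu}$: $\delta A^{\mu} = -\Gamma^{\mu}_{\lambda\nu} A^{\lambda} dx^{\nu}$.

In the usual way, the covariant derivative of a vector field $A^{\mu}$ can be
defined as $A^{\mu}{}_{;\sigma} = A^{\mu}{}_{,\sigma} + \Gamma^{\mu}_{\lambda\sigma} A^{\lambda}$ and a curvature tensor can be constructed as
shown by Equation (1):

$$R^{\mu}{}_{\sigma\rho\nu} = \Gamma^{\mu}_{\sigma\nu,\rho} - \Gamma^{\mu}_{\sigma\rho,\nu} + \Gamma^{\mu}_{\lambda\rho}\Gamma^{\lambda}_{\sigma\nu} - \Gamma^{\mu}_{\lambda\nu}\Gamma^{\lambda}_{\sigma\rho} \qquad (1)$$

This has two independent contractions. The first, given by Equation (2),

$$R_{\sigma\nu} = R^{\mu}{}_{\sigma\mu\nu} = \Gamma^{\mu}_{\sigma\nu,\mu} - \Gamma^{\mu}_{\sigma\mu,\nu} + \Gamma^{\mu}_{\lambda\mu}\Gamma^{\lambda}_{\sigma\nu} - \Gamma^{\mu}_{\lambda\nu}\Gamma^{\lambda}_{\sigma\mu} \qquad (2)$$

is a generalization of the Ricci tensor of Riemannian geometry. The second,
given by Equation (3),

$$S_{\rho\nu} = R^{\mu}{}_{\mu\rho\nu} = \Gamma^{\mu}_{\mu\nu,\rho} - \Gamma^{\mu}_{\mu\rho,\nu} \qquad (3)$$

which vanishes in a Riemannian space, is completely antisymmetric.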
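
The absence of terms quadratic in the connection from the second contraction can be checked in one line. The following is only a sketch under an assumed index convention (the differentiation index placed last on the connection, and the curvature tensor written as $R^{\mu}{}_{\sigma\rho\nu}$); with another convention the overall signs may differ, but the cancellation does not.

```latex
\begin{align*}
R^{\mu}{}_{\sigma\rho\nu}
  &= \Gamma^{\mu}_{\sigma\nu,\rho} - \Gamma^{\mu}_{\sigma\rho,\nu}
   + \Gamma^{\mu}_{\lambda\rho}\Gamma^{\lambda}_{\sigma\nu}
   - \Gamma^{\mu}_{\lambda\nu}\Gamma^{\lambda}_{\sigma\rho}, \\
R^{\mu}{}_{\mu\rho\nu}
  &= \Gamma^{\mu}_{\mu\nu,\rho} - \Gamma^{\mu}_{\mu\rho,\nu}
   + \Gamma^{\mu}_{\lambda\rho}\Gamma^{\lambda}_{\mu\nu}
   - \Gamma^{\mu}_{\lambda\nu}\Gamma^{\lambda}_{\mu\rho} \\
  &= \Gamma^{\mu}_{\mu\nu,\rho} - \Gamma^{\mu}_{\mu\rho,\nu}
  \qquad (\text{relabel } \mu \leftrightarrow \lambda \text{ in the last term}).
\end{align*}
% In a Riemannian space \Gamma^{\mu}_{\mu\nu} = \partial_{\nu} \ln\sqrt{|g|},
% so the two remaining terms are equal and the contraction vanishes.
```

The result is manifestly antisymmetric in its two free indices, and it is a pure curl of the contracted connection.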

Metric  concepts  have  not  yet  been  introduced  and  there  is  no  way  to 
compare  lengths  along  different  curves  or  to  measure  angles.  We  know, 
however,  that  at  some  scales  the  concepts  of  length  and  time  are  meaningful 
in our universe. In order to represent them we need to have a metric tensor,
a symmetric second rank tensor $g_{\mu\nu}$, to define a magnitude $ds$ associated
with a displacement $dx^{\mu}$ via the generalization of the Pythagorean theorem
$ds^{2} = g_{\mu\nu} dx^{\mu} dx^{\nu}$.

If $g_{\mu\nu}$ is not to be a foreign body within the theory then we must use a
symmetric second rank tensor that has already been defined, and the only
possibility at this stage is the symmetric part of $R_{\mu\nu}$. Thus $g_{\mu\nu}$ must be
proportional to $R_{(\mu\nu)}$. We can write this relationship more suggestively as
Equation (4):

$$R_{(\mu\nu)} = \Lambda g_{\mu\nu} \qquad (4)$$

with $\Lambda$ a constant. If $ds$ is to have the same dimension as $dx^{\mu}$, dimensional
analysis with powers of $dx^{\mu}$ shows that $R_{(\mu\nu)}$ has dimension -2 so that $\Lambda$
must also have dimension -2. In metric language it must have the
dimensions of an inverse length squared.

The similarity between Eq. (4) and Einstein's vacuum equations
with a cosmological term can hardly be missed. (The symbol $\Lambda$ was not
chosen at random.) To see the significance of this more fully, we need to
look briefly at Schrödinger's formulation of what he called “the final affine
field laws.”

To obtain dynamical equations for the field variables $\Gamma^{\lambda}_{\mu\nu}$ we use
Hamilton's principle with an appropriate Lagrange density. At this point we
have no way to raise and lower indices, so the simplest density can be
written as Eq. (5):

$$L = (2 / \Lambda) \sqrt{-\det R_{\mu\nu}} \qquad (5)$$

where $\Lambda$ is a constant which will give the action the proper dimensions. (It
is also possible to use $R_{\mu\nu} + (1/4) S_{\mu\nu}$, with $S_{\mu\nu}$ the second contraction of
Eq. (3), in Eq. (5) to achieve projective invariance.)13

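Following Schrödinger, we define the tensor density $\mathfrak{g}^{\mu\nu}$ by

$$\partial L / \partial R_{\mu\nu} = \mathfrak{g}^{\mu\nu} \qquad (6)$$

and form the covariant and contravariant tensors $g_{\mu\nu}$ and $g^{\mu\nu}$ associated
with it. The Euler-Lagrange equations of the variational principle then give
Eq. (7):

$$g_{\mu\nu,\lambda} - {}^{*}\Gamma^{\rho}_{\mu\lambda} g_{\rho\nu} - {}^{*}\Gamma^{\rho}_{\lambda\nu} g_{\mu\rho} = 0 \qquad (7)$$

where ${}^{*}\Gamma^{\lambda}_{\mu\nu}$, Schrödinger's “star affinity”, is an abbreviation for
$\Gamma^{\lambda}_{\mu\nu} + (2/3) \delta^{\lambda}_{\mu} \Gamma_{\nu}$ and $\Gamma_{\nu}$ is $(1/2)(\Gamma^{\alpha}_{\nu\alpha} - \Gamma^{\alpha}_{\alpha\nu})$, a contraction of the
torsion.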


The defining Eq. (6) is equivalent to Eq. (8):

$$R_{\mu\nu} = \Lambda g_{\mu\nu} \qquad (8)$$

so that Eq. (7) can be written as an equation involving only the affinity and
the contracted curvature tensor. In the limiting case in which $\Gamma^{\lambda}_{\mu\nu}$ is
symmetric, Eq. (7) is the equation satisfied by the metric tensor in
Riemannian geometry. It can be solved to give the $\Gamma^{\lambda}_{\mu\nu}$ as functions of $g_{\mu\nu}$
and its derivatives, the Christoffel brackets. $R_{\mu\nu}$ is then the symmetric Ricci
tensor of general relativity, and Eq. (4) and Eq. (8) are seen to be identical,
the vacuum Einstein equations with a cosmological term.
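
The step from the definition of the tensor density to the proportionality of the contracted curvature tensor and $g_{\mu\nu}$ can be sketched as follows. This is an illustrative derivation, not a quotation of Schrödinger's own: it assumes that the Lagrange density is $L = (2/\Lambda)\sqrt{-\det R_{\mu\nu}}$, uses the standard identity for the derivative of a determinant, and ignores the index-ordering convention for the inverse of a non-symmetric tensor.

```latex
\begin{align*}
\frac{\partial \det R}{\partial R_{\mu\nu}} &= (\det R)\,(R^{-1})^{\nu\mu}, \\
\frac{\partial L}{\partial R_{\mu\nu}}
  &= \frac{2}{\Lambda}\,\frac{-(\det R)\,(R^{-1})^{\nu\mu}}{2\sqrt{-\det R}}
   = \frac{1}{\Lambda}\sqrt{-\det R}\;(R^{-1})^{\nu\mu}. \\
\intertext{Trying $g_{\mu\nu} = R_{\mu\nu}/\Lambda$ in four dimensions gives}
\sqrt{-\det g} &= \frac{\sqrt{-\det R}}{\Lambda^{2}}, \qquad
g^{-1} = \Lambda\,R^{-1}, \\
\sqrt{-\det g}\;g^{-1} &= \frac{1}{\Lambda}\sqrt{-\det R}\;R^{-1},
\end{align*}
% which reproduces the right-hand side of the previous line, so the
% definition of the density is satisfied by R_{\mu\nu} = \Lambda g_{\mu\nu}.
```

The factor of 2 in the assumed density is what makes the constant of proportionality exactly $\Lambda$.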

The Demise of Metric


The  metric  concept  is  meaningful  in  our  universe  only  at  sufficiently 
large  scales.  The  uncertainty  principle  of  quantum  theory  and  the  effect  of 
gravitation  on  clock  rates,  together  with  limited  resolution  of  a clock  due  to 
its  size,  show  that  below  certain  limits  time  intervals  cannot  be  measured.14 


To measure a time interval we must use a clock. The time-energy
uncertainty relation tells us that the time $t$ taken for the measurement and
the uncertainty $\Delta E$ in the clock's energy must satisfy $t\,\Delta E \gtrsim \hbar$. On the other
hand, the gravitational field of the clock's energy at a distance $R$ will result
in a change $\delta t$ in a time interval $t$ given by $\delta t / t \sim GE / c^{4}R$, where $E$ includes
both the original rest energy of the clock and $\Delta E$. $t$ can be no smaller than $R/c$
if the parts of the clock are to communicate with one another during the
measurement. When these results are combined we find that $\delta t \gtrsim G\hbar / c^{5}t$.
Since the measurement is of no value unless $\delta t < t$, $t$ must be greater than
T* = $\sqrt{\hbar G / c^{5}} \sim 10^{-43}$ s, the Planck time. The corresponding limit for
spatial dimensions, the Planck length, is L* = cT* $\sim 10^{-35}$ m.
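
The orders of magnitude quoted for the Planck time and length are easy to check numerically. The sketch below is illustrative only: the function names are ours, the constants are CODATA 2018 values, and the error bound $G\hbar / c^{5}t$ is the order-of-magnitude estimate from the clock argument, with numerical factors of order unity ignored.

```python
import math

# CODATA 2018 values in SI units.
HBAR = 1.054571817e-34  # reduced Planck constant, J s
G = 6.67430e-11         # Newtonian gravitational constant, m^3 kg^-1 s^-2
C = 299792458.0         # speed of light, m/s


def planck_time():
    """T* = sqrt(hbar * G / c^5), in seconds."""
    return math.sqrt(HBAR * G / C**5)


def planck_length():
    """L* = c * T*, in metres."""
    return C * planck_time()


def clock_error_bound(t):
    """Order-of-magnitude lower bound G*hbar / (c^5 * t) on the timing
    error of a measurement that lasts a time t (hypothetical helper)."""
    return HBAR * G / (C**5 * t)


if __name__ == "__main__":
    t_star = planck_time()
    print(f"Planck time   T* = {t_star:.3e} s")
    print(f"Planck length L* = {planck_length():.3e} m")
    # The ratio (error bound) / t crosses 1 exactly at t = T*.
    for t in (1e-40, t_star, 1e-46):
        print(f"t = {t:.3e} s  ->  bound / t = {clock_error_bound(t) / t:.3e}")
```

For durations longer than T* the bound is smaller than the interval being measured; for shorter durations it exceeds the interval, which is the sense in which the measurement loses its value.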


Planck  noted  when  he  introduced  the  constant  h that  with  c and  G it 
implied  natural  units  of  length,  time  and  mass,15  but  the  significance  of  these 
units  seems  not  to  have  been  given  much  consideration  in  the  period  when 
Schrödinger was developing his theory. As the previous paragraph shows,
T*  and  L*  are  not  just  results  of  dimensional  analysis  but  basic  limits  on 
measurement  of  space-time  intervals.  Metric  concepts  are  valid  only  for 
sufficiently  large  regions,  a possibility  already  foreseen  by  Riemann.16 

It  has  been  suggested  by  some  authors  that  various  types  of  discrete 
structure  for  space-time  should  be  considered  at  small  scales.17  But  there  are 
problems  with  abandoning  the  continuum  concept  and  giving  space-time  an 
atomistic  character.  The  full  group  of  coordinate  transformations  of  the 
continuum  cannot  be  adequately  approximated  in  a discrete  model  of  space- 
time.18  It  seems  wise  to  take  a more  conservative  approach  and  consider 
space-time  as  a continuous  manifold  in  which  the  metric  concept  breaks 
down  at  sufficiently  small  scales. 

One  way  to  understand  the  failure  of  space-time’s  metrical  character 
in  the  context  of  affine  theory  begins  by  noting  that  a metric  tensor  defined 
by Eq. (4) depends on the first derivatives of the connection. (This reverses
the relationship of Riemannian geometry, in which the Christoffel brackets
involve derivatives of the metric.) If $\Gamma^{\lambda}_{\mu\nu}$ were continuous but not
differentiable in a region then $g_{\mu\nu}$ would not be defined there. The affine
connection in those regions could not obey differential equations and would
have a fractal structure.19 One consequence of this would be scale
invariance: No constant with the dimensions of length would be involved.

It is not difficult to find functions that are everywhere continuous
but differentiable nowhere, or only on a set of points of measure zero, in a
domain: Weierstrass first published an example of such a function, the
Fourier series $f(t) = \sum_{n} b^{n} \sin(a^{n} \pi t)$, where $n$ ranges from 1 to $\infty$ and
$ab > 1 + 3\pi/2$.20 If the connection coefficients were represented by such functions
then their behavior in their function space would be somewhat like
Brownian motion or turbulence.
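
The failure of differentiability can be illustrated numerically with partial sums of the series. The sketch below is not a proof and makes several assumptions of our own: the parameter values a = 13 and b = 0.5 (chosen only so that ab exceeds 1 + 3π/2), the truncation at ten terms, and the evaluation point t = 0. Every partial sum is smooth, but the difference quotients taken over steps h = a^(-k) keep growing as h shrinks instead of settling toward a derivative.

```python
import math


def weierstrass_partial(t, a=13, b=0.5, n_terms=10):
    """Partial sum of sum_{n>=1} b^n * sin(a^n * pi * t)."""
    return sum(b**n * math.sin(a**n * math.pi * t) for n in range(1, n_terms + 1))


def difference_quotients(t0=0.0, a=13, b=0.5, n_terms=10, k_values=range(2, 7)):
    """|f(t0 + h) - f(t0)| / h for the step sizes h = a^(-k)."""
    f0 = weierstrass_partial(t0, a, b, n_terms)
    quotients = []
    for k in k_values:
        h = float(a) ** (-k)
        f1 = weierstrass_partial(t0 + h, a, b, n_terms)
        quotients.append(abs(f1 - f0) / h)
    return quotients


if __name__ == "__main__":
    a, b = 13, 0.5
    # Illustrative parameters must satisfy the condition quoted in the text.
    assert a * b > 1 + 1.5 * math.pi
    for k, q in zip(range(2, 7), difference_quotients(a=a, b=b)):
        print(f"h = {a}^-{k}:  |f(h) - f(0)| / h = {q:.4e}")
```

Each reduction of the step by a factor a multiplies the quotient by roughly ab, so no finite slope emerges at this point.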

Suppose  that  this  were  the  case  for  sufficiently  small  regions  of 
space-time.  (“Small”  can  only  be  defined  from  the  outside,  since  there  is 
no  metric  inside  these  regions.)  The  metric  concept  would  break  down  if 
attempts  were  made  to  explore  such  regions,  and  we  have  already  seen  that 
that  is  actually  the  case  for  time  and  length  scales  below  the  Planck  values. 
On the other hand, scale invariance would be broken by the presence of the
length $|\Lambda|^{-1/2}$. We will consider the question of the spatio-temporal scale at
which the metric concept fails when we discuss cosmology in the next
section.

Palmer  has  proposed  a new  approach  to  quantum  theory  based  on 
the  ideas  “that  states  of  physical  reality  belong  to,  and  are  governed  by,  a 
non-computable  fractal  subset  I of  state  space,  invariant  under  the  action  of 
some  subordinate  deterministic  causal  dynamics  D”  and  that  “gravity  plays 
a key role in generating the fractal geometry of I.”21 An affine theory seems
to have the potential to provide such a geometry. But whether or not that proposal
is pursued, it seems clear that an affine approach can ensure that one of the
basic requirements for an adequate theory of quantum gravity is satisfied.

The Necessity of a Cosmological Term

The  way  in  which  we  have  led  up  to  Eq.  (8)  brings  out  the  inevitable 
appearance  of  the  cosmological  term.  It  is  instructive  to  proceed  this  way 
because  that  term  has  been  the  object  of  controversy.  Einstein  added  this 
term  to  his  original  equations  to  make  a static  universe  possible  and  then 
dropped  it  when  it  was  found  that  the  universe  is  expanding.  Between 
Hubble’s  discovery  of  cosmic  expansion  in  1929  and  the  realization  in  the 
late  1990s  that  this  expansion  was  indeed  speeding  up,  many  workers  in 
general  relativity  and  cosmology  dismissed  the  cosmological  term. 

A number  of  theoretical  models  of  dark  energy  have  been  proposed 
to  account  for  the  acceleration  of  cosmic  expansion,  but  present  data  are 
compatible  with  Einstein's  1917  cosmological  term.22  The  simplest  solution 
of Eq. (8) is the well-known de Sitter metric which can be written as Eq.
(9):

$$ds^{2} = -dt^{2} + \exp(2Ht)\,dl^{2} \qquad (9)$$

where $c = 1$ now (so that length and time scales are identical), $dl$ is the
Euclidean spatial line element and $H = (\Lambda/3)^{1/2}$ is the Hubble constant.

The  fact  that  the  purely  affine  theory  requires  a cosmological  term  must 
count in its favor as the prediction of a “novel fact,” one that was not assumed
in  the  formulation  of  the  theory.  Most  other  dark  energy  models  have  been 
introduced  precisely  to  explain  cosmic  acceleration  and  cannot  therefore 
count  it  as  a prediction. 
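
The relation between the expansion rate and the cosmological constant in the de Sitter solution can be made concrete with a short calculation. The sketch below restores the factor of c, so that H = c (Λ/3)^(1/2) in SI units. The Hubble rate of 70 km/s/Mpc is an assumed round value used only for illustration, and the model is a pure de Sitter universe with matter ignored, so the resulting Λ is an order-of-magnitude figure rather than a fitted one.

```python
import math

C = 299792458.0        # speed of light, m/s
MPC_IN_M = 3.0857e22   # metres per megaparsec (rounded)


def hubble_rate(lam):
    """H = c * sqrt(Lambda / 3) in s^-1, for Lambda in m^-2."""
    return C * math.sqrt(lam / 3.0)


def lambda_from_hubble(h):
    """Inverse relation: Lambda = 3 H^2 / c^2 in m^-2, for H in s^-1."""
    return 3.0 * h**2 / C**2


def scale_factor(t, lam):
    """de Sitter scale factor exp(H t), normalised to 1 at t = 0."""
    return math.exp(hubble_rate(lam) * t)


if __name__ == "__main__":
    h0 = 70.0e3 / MPC_IN_M  # assumed illustrative value, converted to s^-1
    lam = lambda_from_hubble(h0)
    print(f"H      = {h0:.3e} s^-1")
    print(f"Lambda = {lam:.3e} m^-2")
    print(f"e-folding time 1/H = {1.0 / h0:.3e} s")
    print(f"length |Lambda|^(-1/2) = {lam ** -0.5:.3e} m")
```

With these inputs the length $|\Lambda|^{-1/2}$ comes out of cosmological size, which is the expectation the article later attributes to Eddington's era.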

Quantum  theory,  however,  presents  the  problem  of  a cosmological 
constant  that  is  far  too  large.  The  cosmological  term  in  the  gravitational 
field  equations  has  the  form  of  a stress-energy  tensor  for  a fluid  whose 
pressure  is  negative  and  equal  in  magnitude  to  its  energy  density,  and  the 
vacuum  energy  of  quantum  fields  has  just  this  form.  This  effective 
cosmological  constant  obtained  from  quantum  field  theory  cannot  be 
reconciled  with  observations.  Zel’dovich’s23  calculation  of  the  vacuum 
stress-energy  tensor  for  an  assortment  of  boson  and  fermion  fields  with  a 
cutoff  on  the  order  of  the  proton’s  Compton  wavelength  gave  a value  44 
orders  of  magnitude  larger  than  what  observations  at  that  time  would  allow. 

A more fundamental cutoff is defined by the Planck length.
The resulting vacuum energy in Einstein's equations gives a model
universe that, according to Eq. (9), expands exponentially with a
characteristic time on the order of T* ~ $10^{-43}$ s. Of course this is
unacceptable.

But  things  could  be  different  when  we  consider  this  vacuum  energy 
in the context of affine theory. The vacuum energy density $\rho$ gives rise to
an effective cosmological constant $8\pi\rho$, and if we were to combine this with
$\Lambda$ to give the effective cosmological constant of Eq. (10):

$$\Lambda' = \Lambda + 8\pi\rho \qquad (10)$$

then we could replace Eq. (8) with Eq. (11):

$$R_{\mu\nu} = \Lambda' g_{\mu\nu} \qquad (11)$$

We could then choose the value of $\Lambda$ to “renormalize” the effective
cosmological constant to a value $\Lambda'$ in accord with observations, as
Shifflett has suggested.24 Since $\rho$ has a large positive value, $\Lambda$ would have
to have a negative value of nearly the same magnitude.
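
A rough numerical sketch shows how delicate this cancellation would have to be. The numbers are assumptions made for illustration, not results from the sources cited: the vacuum term $8\pi\rho$ is taken to be of order one inverse Planck length squared (numerical factors ignored), and the observed effective constant is taken as a round $10^{-52}$ m$^{-2}$.

```python
import math

L_PLANCK = 1.616255e-35   # Planck length in metres (CODATA 2018)
LAMBDA_OBSERVED = 1.0e-52  # assumed round value for the effective constant, m^-2


def vacuum_term_planck_cutoff():
    """Order-of-magnitude stand-in for 8*pi*rho with a Planck-length cutoff, m^-2."""
    return 1.0 / L_PLANCK**2


def bare_lambda(lambda_effective, vacuum_term):
    """Lambda = Lambda' - 8*pi*rho, from the relation Lambda' = Lambda + 8*pi*rho."""
    return lambda_effective - vacuum_term


def tuning_orders(lambda_effective, vacuum_term):
    """Number of decimal orders over which the two terms must cancel."""
    return math.log10(vacuum_term / lambda_effective)


if __name__ == "__main__":
    vac = vacuum_term_planck_cutoff()
    lam = bare_lambda(LAMBDA_OBSERVED, vac)
    print(f"vacuum term   ~ {vac:.3e} m^-2")
    # Double precision cannot resolve the difference between -vac and lam,
    # which is itself a sign of how fine the cancellation is.
    print(f"bare Lambda   ~ {lam:.3e} m^-2")
    print(f"orders of magnitude of cancellation ~ {tuning_orders(LAMBDA_OBSERVED, vac):.1f}")
```

Under these assumptions the bare $\Lambda$ is negative and agrees with $-8\pi\rho$ to roughly one part in $10^{121}$.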

Choosing such a value for $\Lambda$ is not completely arbitrary. The
cosmological constant defines a fundamental length and time $|\Lambda|^{-1/2}$ which
would break the scale invariance of the fractal regime discussed in the
previous section. If that is close to the cutoff for calculation of vacuum
energy then $\Lambda$ and $\Lambda'$ would be of the same order of magnitude.

Affine theory has a universal standard of length, $|\Lambda|^{-1/2}$. In the 1930s
Eddington gave this as a reason to retain “the cosmical constant” even after
the original motive for it had disappeared.25 It seemed obvious then that this
length would be of cosmological size. The idea that there are two
fundamental lengths, one provided by $\Lambda$ and the other defined by the
fundamental constants h, c, and G, might have raised suspicions if the
physical significance of the Planck length had been given more attention at
that time. We can see now the possibility that the two fundamental lengths
are approximately the same.

Prospects for Further Progress

Eqs. (7) and (8) are the field equations of Schrödinger's theory,
which reduce to Einstein's vacuum equations with a cosmological term in
the limiting case of a symmetric connection. Since we are not pursuing a
unified field theory we can describe other fields and particles by introducing
non-geometric variables $\phi_{A}$ and a Lagrange density $L_{m}$ which will depend
on the $\phi_{A}$, their derivatives and $g_{\mu\nu}$ which, in turn, is a function of $\Gamma^{\lambda}_{\mu\nu}$ and
its derivatives via Eq. (8).26 If $L_{m}$ is added to Eq. (5) and the procedures of
Hamilton's principle are carried through then the full gravitational field
equations with an energy-momentum tensor defined in the standard way can
be obtained if that tensor is small in comparison with the cosmological term.

The  energy  densities  of  baryonic  and  dark  matter  are  much  smaller 
than the magnitude of the $\Lambda$ that we have hypothesized to renormalize
quantum vacuum energy, so this approximation makes sense for those
forms of matter. However, the vacuum energy itself is comparable in
magnitude with $\Lambda$. So while this approximation method has some value, it
does  not  enable  us  to  shed  any  light  on  features  of  quantum  field  theory  such 
as  ultraviolet  divergences. 

We  have  taken  advantage  of  the  possibility  of  a connection  that  is 
not  differentiable  to  suggest  that  the  metric  concept  might  break  down 
below  some  space-time  scale,  and  suggest  that  this  could  be  correlated  with 
the  implication  of  the  uncertainty  principle  and  the  gravitational  effect  on 
clock  rates  that  intervals  below  the  Planck  scale  cannot  be  measured. 
However,  this  would  also  mean  that  the  Ricci  tensor,  which  is  used  to  form 
the  Lagrangian,  involves  derivatives  of  the  connection  and  would  not  be 
defined.  The  classical  Hamilton’s  principle  could  no  longer  be  used  to 
derive  field  equations.  This  is  not  surprising  if  the  metric  concept  indeed 
fails  below  the  Planck  scale. 

One  way  to  proceed  would  be  to  look  for  an  algebraic  expression 
which  approximates  the  classical  action  Eq.  (5)  on  scales  for  which  the  latter 
is  meaningful.  We  could  then  use  this  action  in  Feynman’s  “sum  over 
histories”  approach  to  quantum  theory  in  order  to  explore  the  implications 
of the affine theory further.

For now we have barely a hint of ways in which an adequate
quantum theory of matter and non-gravitational phenomena might be
developed on the basis of affine geometry. This approach does, however,
seem capable of showing in a natural way how the metric concept may fail
below the Planck scale without abandoning the continuum concept. In
addition, the requirement that the cosmological constant not vanish and the
possibility that the theory can deal with the problem of a huge quantum
vacuum energy mean that the observed cosmic acceleration can be
explained in an unforced manner. The long-dormant theory that
Schrödinger proposed in the 1940s seems today to have some promise.

32034 (LF)

JONG,  SHUNG-CHANG  (Dr.)  8892  Whitechurch  Ct,  Bristow  VA  20136  (LF) 
JORDANA,  ROMAN  DE  VICENTE  (Dr.)  Batalla  De  Garellano,  15,  Aravaca, 

28023,  Madrid,  Spain  (EF) 

KADTKE,  JAMES  (Dr.)  Apt.  824,  1701  16th  St.  NW,  Washington  DC  20009-3131 
(M) 

KAHN,  ROBERT  E.  (Dr.)  909  Lynton  Place,  Mclean  VA  22102  (F) 
KAPETANAKOS, C. A. (Dr.) 4431 MacArthur Blvd, Washington DC 20007 (EF)
KATZ,  ROBERT  (Dr.)  3310  N.  Leisure  Blvd  #530,  Silver  Spring  MD  20906  (EF) 


Winter  2015 




KAUFHOLD,  JOHN  (Dr.)  Suite  1200,  4601  N.  Fairfax  Dr,  Arlington  VA  22203 
(M) 

KAY,  PEG  (Ms.)  6111  Wooten  Drive,  Falls  Church  VA  22044  (LF) 

KEEFER,  LARRY  (Dr.)  7016  River  Road,  Bethesda  MD  20817  (F) 

KEISER, BERNHARD E. (Dr.) 2046 Carrhill Road, Vienna VA 22181-2917 (LF)
KLINGSBERG,  CYRUS  (Dr.)  Apt.  LI 84,  500  E.  Marylyn  Ave,  State  College  PA 
16801-6225  (EF) 

KLOPFENSTEIN, REX C. (Mr.) 4224 Worcester Dr., Fairfax VA 22032-1140 (LF)
KRUEGER,  GERALD  P.  (Dr.)  Krueger  Ergonomics  Consultants,  4105  Komes 
Court,  Alexandria  VA  22306-1252  (EM) 

KUO, CHUN-HUNG (Dr.) 4637 Knight Place, Alexandria VA 22311 (M)
LAWSON,  ROGER  H.  (Dr.)  10613  Steamboat  Landing,  Columbia  MD  21044  (EF) 
LEDGER,  SAM  (Mr.)  420  7th  Street  NW,  Apt  903,  Washington  DC  20004  (M) 
LEIBOWITZ, LAWRENCE M. (Dr.) 3903 Laro Court, Fairfax VA 22031 (LF)
LEMKIN,  PETER  (Dr.)  148  Keeneland  Circle,  North  Potomac  MD  20878  (EM) 
LESHUK,  RICHARD  (Mr.)  9004  Paddock  Lane,  Potomac  MD  20854  (M) 

LEWIS,  DAVID  C.  (Dr.)  27  Bolling  Circle,  Palmyra  VA  22963  (F) 

LEWIS,  E.  NEIL  (Dr.)  Malvern  Instruments,  Suite  300,  7221  Lee  Deforest  Dr, 
Columbia  MD  21046  (M) 

LIBELO,  LOUIS  F.  (Dr.)  9413  Bulls  Run  Parkway,  Bethesda  MD  20817  (LF) 
LONDON,  MARILYN  (Ms.)  3520  Nimitz  Rd,  Kensington  MD  20895  (F) 
LONGSTRETH, III, WALLACE I (Mr.) 8709 Humming Bird Court, Laurel MD
20723-1254 (M)

LOOMIS,  TOM  H.  W.  (Mr.)  11502  Allview  Dr.,  Beltsville  MD  20705  (EM) 

LUTZ, ROBERT J. (Dr.) 6031 Willow Glen Dr, Wilmington NC 28412 (EF)

LYON,  HARRY  B.  (Mr.)  7722  Northdown  Road,  Alexandria  VA  22308-1329  (M) 
LYONS,  JOHN  W.  (Dr.)  7430  Woodville  Road,  Mt.  Airy  MD  21771  (EF) 
MALCOM,  SHIRLEY  M.  (Dr.)  12901  Wexford  Park,  Clarksville  MD  21029-1401 
(F) 

MANDERSCHEID,  RONALD  W.  (Dr.)  10837  Admirals  Way,  Potomac  MD  20854- 
1232  (LF) 

MASON,  LANCE  (Dr.)  1212  Calla  Cerrito,  Santa  Barbara  CA  93101  (M) 
MCFADDEN,  GEOFFREY  B (Dr.)  20117  Darlington  Drive,  Montgomery  Village 
MD  20886  (M) 

MENZER,  ROBERT  E.  (Dr.)  90  Highpoint  Dr,  Gulf  Breeze  FL  32561-4014  (EF) 
MESSINA,  CARLA  G.  (Mrs.)  9800  Marquette  Drive,  Bethesda  MD  20817  (F) 
METAILIE,  GEORGES  C.  (Dr.)  18  Rue  Liancourt,  75014  Paris,  FRANCE  (F) 
MIELCZAREK, EUGENIE A. (Dr.) 3181 Readsborough Ct, Fairfax VA 22031-2625 (F)

MILLER,  JAY  H.  (Mr.)  8924  Ridge  Place,  Bethesda  MD  20817-3364  (M) 

MILLER  II,  ROBERT  D.  (Dr.)  The  Catholic  University  of  America,  10918  Dresden 
Drive,  Beltsville  MD  20705  (M) 

MORGOUNOV, ALEXEY (Dr.) CIMMYT, P.K. 39, Emek, Ankara 06511, Turkey
(M) 

MORRIS,  JOSEPH  (Mr.)  PO  Box  3005,  Oakton  VA  22124-9005  (M) 


Washington  Academy  of  Sciences 




MORRIS,  P.E.,  ALAN  (Dr.)  4550  N.  Park  Ave.  #104,  Chevy  Chase  MD  20815  (EF) 
MOUNTAIN,  RAYMOND  D.  (Dr.)  701  King  Farm  Blvd  #327,  Rockville  MD 
20850  (F) 

MUMMA, MICHAEL J. (Dr.) 210 Glen Oban Drive, Arnold MD 21012 (F)
MURDOCH,  WALLACE  P.  (Dr.)  65  Magaw  Avenue,  Carlisle  PA  17015  (EF) 
NORRIS, KARL H. (Mr.) 11204 Montgomery Road, Beltsville MD 20705 (EF)
O'HARE,  JOHN  J.  (Dr.)  108  Rutland  Blvd,  West  Palm  Beach  FL  33405-5057  (EF) 
OHRINGER,  LEE  (Mr.)  5014  Rodman  Road,  Bethesda  MD  20816  (EF) 

ORDWAY,  FRED  (Dr.)  5205  Elsmere  Avenue,  Bethesda  MD  20814-5732  (EF) 
OTT,  WILLIAM  R (Dr.)  19125  N.  Pike  Creek  Place,  Montgomery  Village  MD 
20886  (EF) 

PAJER,  BERNADETTE  (Mrs.)  25116  143rd  St.  SE,  Monroe  WA  98272  (M) 

PARIS,  ANTONIO  (Mr.)  25066  12th  Ave  N,  St  Petersburg  FL  33713-5510  (M) 
PARR,  ALBERT  C (Dr.)  2656  SW  Eastwood  Avenue,  Gresham  OR  97080-9477  (F) 
PATEL, D. G. (Dr.) 11403 Crownwood Lane, Rockville MD 20850 (F)

PAULONIS,  JOHN  (Mr.)  Yonkers  NY  10710  (M) 

PAULONIS,  JOHN  J (Mr.)  P.O.  Box  335,  Yonkers  NY  10710  (M) 

PAZ,  ELVIRA  L.  (Dr.)  172  Cook  Hill  Road,  Wallingford  CT  06492  (ELF) 
PICKHOLTZ, RAYMOND L. (Dr.) 3613 Glenbrook Road, Fairfax VA 22031-3210
(EF) 

PLESCIA,  JEFFREY  (Dr.)  Applied  Physics  Laboratory,  The  Johns  Hopkins 

University,  MS  200-W230  11100  Johns  Hopkins  Road,  Laurel  MD  20723-6099 
(M) 

POLAVARAPU,  MURTY  10416  Hunter  Ridge  Dr.,  Oakton  VA  22124  (LF) 
POLINSKI, ROMUALD (Mr.) 01-201 WARSZAWA, UL., Wolska 43, Poland (M)
PRZYTYCKI,  JOZEF  M.  (Prof.)  10005  Broad  St,  Bethesda  MD  20814  (F) 

PYKE,  JR,  THOMAS  N.  (Mr.)  4887  N.  35th  Road,  Arlington  VA  22207  (F) 
RADER,  CHARLES  A.  (Mr.)  1101  Paca  Drive,  Edgewater  MD  21037  (EF) 
RAMAKER,  DAVID  E.  (Dr.)  6943  Essex  Avenue,  Springfield  VA  22150  (EF) 
RAVITSKY,  CHARLES  (Mr.)  37129  Village  37,  Camarillo  CA  93012  (EF) 
READER,  JOSEPH  (Dr.)  National  Institute  of  Standards  and  Technology,  100 
Bureau  Drive,  MS  8422,  Gaithersburg  MD  20899-8422  (F) 

REAGAN,  ANN  M.  (Dr.)  PO  Box  22,  Lusby  Maryland  20657  (M) 

REDISH,  EDWARD  F.  (Prof.)  6820  Winterberry  Lane,  Bethesda  MD  20817  (F) 
REISCHAUER,  ROBERT  (Dr.)  5509  Mohican  Rd,  Bethesda  MD  20816  (EF) 
RENAUD,  PHILIP  (Capt.)  Living  Oceans  Foundation,  8181  Professional  Place  Suite 
215,  Landover  MD  20785  (M) 

RICH,  PAUL  (Dr.)  1527  New  Hampshire  Avenue,  NW,  Washington  DC  20036  (M) 
RICKER,  RICHARD  (Dr.)  12809  Talley  Ln,  Darnestown  MD  20878-6108  (F) 
RIDGELL,  MARY  P.O.  Box  133,  48073  Mattapany  Road,  St.  Mary's  City  MD 
20686-0133  (LM) 

ROBERTS,  SUSAN  (Dr.)  Ocean  Studies  Board,  Keck  607,  National  Research 
Council,  500  Fifth  Street,  NW,  Washington  DC  20001  (F) 

ROGERS, KENNETH (Dr.) 355 Fellowship Circle, Gaithersburg MD 20877 (LM)
ROMAN, NANCY GRACE (Dr.) 4620 North Park Avenue Apt. 306W, Chevy
Chase  MD  20815  (F) 

ROOD,  SALLY  A (Dr.)  PO  Box  12093,  Arlington  VA  22219  (F) 

ROSENBLATT,  JOAN  R.  (Dr.)  701  King  Farm  Blvd,  Apt  630,  Rockville  MD  20850 
(EF) 

SAUBERMAN,  P.E.,  HARRY  R (Mr.)  8810  Sandy  Ridge  Ct.,  Fairfax  VA  22031 
(M) 

SCHINDLER,  ALBERT  I.  (Dr.)  6615  Sulky  Lane,  Rockville  MD  20852  (EF) 
SCHMEIDLER,  NEAL  F.  (Mr.)  7218  Hadlow  Drive,  Springfield  VA  22152  (F) 
SCHNEPFE,  MARIAN  M.  (Dr.)  Potomac  Towers,  Apt.  640,  2001  N.  Adams  Street, 
Arlington  VA  22201  (EF) 

SEVERINSKY,  ALEX  J.  (Dr.)  4707  Foxhall  Cres  NW,  Washington  DC  20007-1064 
(EM) 

SHAFRIN,  ELAINE  G.  (Mrs.)  8168  Connecticut  Ave  NW  Apt  2014,  Washington 
DC  20815-817  (EF) 

SHETLER,  STANWYN  G.  (Dr.)  142  E Meadowland  Ln,  Sterling  VA  20164-1144 
(EF) 

SHIELDS,  EDWARD  (Dr.)  PO  Box  165,  Grand  Portage  Minnesota  55605  (M) 
SHROPSHIRE,  JR,  W.  (Dr.)  Apt.  426,  300  Westminster  Canterbury  Dr.,  Winchester 
VA  22603  (LF) 

SILBER,  RONNIE  (Dr.)  13710  Colgate  Way  #1338,  Silver  Spring  MD  20904  (M) 
SIMMS,  JAMES  ROBERT  (Mr.)  9405  Elizabeth  Ct.,  Fulton  MD  20759  (M) 

SMITH,  THOMAS  E.  (Dr.)  3148  Gracefield  Rd  Apt  215,  Silver  Spring  MD  20904- 
5863  (LF) 

SNIECKUS,  MARY  (Ms.)  1700,  Dublin  Dr.,  Silver  Spring  MD  20902  (M) 
SODERBERG,  DAVID  L.  (Mr.)  403  West  Side  Dr.  Apt.  102,  Gaithersburg  MD 
20878  (M) 

SOLAND,  RICHARD  M.  (Dr.)  2516  Arizona  Av  Apt  6,  Santa  Monica  CA  90404- 
1426  (LF) 

SOZER,  AMANDA  (Dr.)  4707B  Eisenhower  Ave,  Alexandria  VA  22304  (M) 
SPILHAUS,  JR,  A.F.  (Dr.)  10900  Picasso  Lane,  Potomac  MD  20854  (EM) 

SRIRAM,  RAM  DUVVURU  (Dr.)  1030  Castlefield  Street,  Ellicott  City  MD  21042 
(LF) 

STEIN,  DAVID  E.  (Mr.)  P.O.  Box  571433,  Las  Vegas  NV  89157  (M) 

STERN,  KURT  H.  (Dr.)  103  Grant  Avenue,  Takoma  Park  MD  20912-4328  (EF) 
STEWART,  PETER  (Prof.)  417  7th  Street  NE,  Washington  DC  20002  (M) 

STIEF,  LOUIS  J.  (Dr.)  332  N St.,  SW.,  Washington  DC  20024-2904  (EF) 
STOMBLER,  ROBIN  (Ms.)  Auburn  Health  Strategies,  3519  South  Four  Mile  Run 
Dr.,  Arlington  VA  22206  (M) 

STRAUSS,  SIMON  W.  (Dr.)  3160  Bracefield  Rd.  RC  1305,  Silver  Spring  MD 
20904  (LF) 

SYED,  ALI  (Dr)  603,  H Street  SW,  Washington  DC  20024  (M) 

TABOR,  HERBERT  (Dr.)  NIDDK,  LBP,  Bldg  8,  Rm  223,  National  Institutes  of 
Health, Bethesda MD 20892-0830 (M)
TAYSING-LARA, MONICA (Ms.) 3343 Deni Place NW, Washington DC 20007
(M) 

TEICH,  ALBERT  H.  (Dr.)  PO  Box  309,  Garrett  Park  MD  20896  (EF) 
THEOFANOS,  MARY  FRANCES  (Ms.)  7241  Antares  Drive,  Gaithersburg  MD 
20879  (M) 

THOMPSON, DENNIS (Mr.) 325 Sandy Ridge Rd, Fredericksburg VA 22405 (M)
THOMPSON,  CHRISTIAN  F.  (Dr.)  278  Palm  Island  Way,  Ponte  Vedra  FL  32081 
(LF) 

TIMASHEV,  SVIATOSLAV  (SLAVA)  A.  (Dr.  Prof.)  3306  Potterton  Dr.,  Falls 
Church VA 22044-1603 (F)

TOUWAIDE,  ALAIN  PO  Box  25499,  Washington  DC  20027  (LF) 

TOWNSEND,  MARJORIE  R.  (Mrs.)  3529  Tilden  Street,  NW,  Washington  DC 
20008-3194  (LF) 

TROXLER, G.W. (Dr.) PO Box 1144, Chincoteague VA 23336-9144 (F)

TYLER,  PAUL  E.  (Dr.)  1023  Rocky  Point  Ct.  N.E.,  Albuquerque  NM  87123-1944 
(EF) 

UBELAKER, DOUGLAS H. (Dr.) Dept. of Anthropology, National Museum of
Natural History, Smithsonian Institution, Washington DC 20560-0112 (F)
UMPLEBY,  STUART  (Professor)  The  George  Washington  University,  Apt  1207 
4141  N Henderson  Rd,  Arlington  VA  22203  (F) 

VAISHNAV,  MARIANNE  P.  (Ms.)  P.O.  Box  2129,  Gaithersburg  MD  20879  (LF) 
VAN  TUYL,  ANDREW  (Dr.)  3618  Littledale  Road,  Apt.  203,  Kensington  MD 
20895-3434  (EF) 

VARADI,  PETER  F.  (Dr.)  Apartment  1606W,  4620  North  Park  Avenue,  Chevy 
Chase  MD  20815-7507  (EF) 

VAVRICK,  DANIEL  J.  (Dr.)  10314  Kupperton  Court,  Fredericksburg  VA  22408 
(F) 

VIZAS,  CHRISTOPHER  (Dr.)  504  East  Capitol  Street,  NE,  Washington  DC  20003 
(M) 

WALDMANN,  THOMAS  A.  (Dr.)  3910  Rickover  Road,  Silver  Spring  MD  20902 
(F) 

WALLER,  JOHN  D.  (Dr.)  5943  Kelley  Court,  Alexandria  VA  22312-3032  (M) 
WEAR,  DOUGLAS  (Dr.)  8014  Barron  Street,  Takoma  Park  MD  20912  (M) 
WEBB,  RALPH  E.  (Dr.)  21-P  Ridge  Road,  Greenbelt  MD  20770  (EF) 

WEIL,  TIMOTHY  (Mr.)  SECURITYFEEDS,  PO  Box  18385,  Denver  CO  80218 
(M) 

WEISS,  ARMAND  B.  (Dr.)  6516  Truman  Lane,  Falls  Church  VA  22043  (LF) 
WERGIN,  WILLIAM  P.  (Dr.)  1 Arch  Place  #322,  Gaithersburg  MD  20878  (EF) 
WHITE,  CARTER  (Dr.)  12160  Forest  Hill  Rd,  Waynesboro  PA  17268  (F) 

WIESE,  WOLFGANG  L.  (Dr.)  8229  Stone  Trail  Drive,  Bethesda  MD  20817  (EF) 
WILCOX,  ROBIN  J.  (Ms.)  8601  Park  Avenue,  Bowie  MD  20720  (M) 
WILLIAMS,  CARL  (Dr.)  2272  Dunster  Lane,  Potomac  MD  20854  (F) 
WILLIAMS,  E.  EUGENE  (Dr.)  Dept.  of  Biological  Sciences,  Salisbury  University, 
1101  Camden  Ave,  Salisbury  MD  21801  (M) 
WITHERSPOON,  F.  DOUGLAS  ASTI,  11316  Smoke  Rise  Ct.,  Fairfax  Station  VA 
22039  (M) 

WU,  KLELI  (Mr.)  360,  Suite  48,  Swift  Ave,  South  San  Francisco  CA  94080  (M) 




Washington  Academy  of  Sciences 
1200  New  York  Avenue 
Rm  113 
Washington,  DC  20005 

Please  fill  in  the  blanks  and  send  your  application  to  the  address  above.  We  will 
contact  you  as  soon  as  your  application  has  been  reviewed  by  the  Membership 
Committee.  Thank  you  for  your  interest  in  the  Washington  Academy  of  Sciences. 

(Dr.  Mrs.  Mr.  Ms) 


Business  Address 


Home  Address 


Email 


Phone 

Cell  Phone 

Preferred  mailing  address:  Business  Home 

Type  of  membership:  Regular  Student 


Schools  of  Higher  Education  attended 

Degrees 

Dates 

Present  Occupation  or  Professional  Position  

Please  list  memberships  in  scientific  societies  - include  office  held: 
Instructions  to  Authors 

1.  Deadlines  for  quarterly  submissions  are: 

Spring  - February  1 
Summer  - May  1 
Fall  - August  1 
Winter  - November  1 

2.  Draft  manuscripts  using  a  word  processing  program  (such  as 
MS  Word).  We  do  not  accept  PDF  manuscripts. 

3.  Papers  should  be  6,000  words  or  fewer.  If  there  are  7 or  more  graphics, 
reduce  the  number  of  words  by  500  for  each  graphic. 

4.  Include  an  abstract  of  150-200  words. 

5.  Include  a  two-  to  three-sentence  bio  of  each  author. 

6.  Graphics  must  be  in  graytone,  and  be  easily  resizable  by  the  editors  to 
fit  the  Journal’s  page  size.  Reference  the  graphic  in  the  text. 

7.  Use  endnotes  or  footnotes.  The  bibliography  may  be  in  a style 
considered  standard  for  the  discipline  or  professional  field  represented 
by  the  paper. 

8.  Submit  papers  as  email  attachments  to  the  editor  or  to 
wasjournal@washacadsci.org. 

9.  Include  the  author’s  name,  affiliation,  and  contact  information, 
including  postal  address.  Membership  in  an  Academy-affiliated  society 
may  also  be  noted  but  is  not  required. 

10.  Manuscripts  are  peer  reviewed  and  become  the  property  of  the 
Washington  Academy  of  Sciences. 

11.  There  are  no  page  charges. 

12.  Manuscripts  can  be  accepted  by  any  of  the  Board  of  Discipline  Editors. 
National  Institute  for  Standards  & Technology  (NIST) 

Meadowlark  Botanical  Gardens 

The  John  W.  Kluge  Center  of  the  Library  of  Congress 

Potomac  Overlook  Regional  Park 

Koshland  Science  Museum 

American  Registry  of  Pathology 

Living  Oceans  Foundation 

National  Rural  Electric  Cooperative  Association  (NRECA) 
Delegates  to  the  Washington  Academy  of  Sciences 
Representing  Affiliated  Scientific  Societies 


Acoustical  Society  of  America,  Washington  Chapter 
American  Assoc.  of  Physics  Teachers,  Chesapeake  Section 
American  Astronomical  Society 
American  Ceramics  Society 
American  Fisheries  Society 

American  Institute  of  Aeronautics  and  Astronautics 
American  Meteorological  Society,  Washington,  DC  Chapter 
American  Nuclear  Society,  Washington  DC  Section 
American  Phytopathological  Society,  Potomac  Division 
American  Society  for  Cybernetics 
American  Society  for  Metals,  Washington  Chapter 
American  Society  of  Civil  Engineers,  National  Capital  Section 
American  Society  of  Mechanical  Engineers,  Washington  Section 

American  Society  of  Microbiology,  Washington  Branch 
American  Society  of  Plant  Biologists,  Mid-Atlantic 
Anthropological  Society  of  Washington 
ASM  International 

Association  for  Women  in  Science,  DC  Metropolitan  Chapter 
Association  for  Computing  Machinery,  DC  Area  Chapter 
Association  for  Science,  Technology,  and  Innovation 

Association  of  Information  Technology  Professionals 
Biological  Society  of  Washington 
Botanical  Society  of  Washington 
Capital  Area  Food  Protection  Association 
Chemical  Society  of  Washington 
District  of  Columbia  Institute  of  Chemists 
District  of  Columbia  Psychological  Association 
Eastern  Sociological  Society 

Electrochemical  Society,  National  Capital  Section 
Entomological  Society  of  Washington 
Geological  Society  of  Washington 

Historical  Society  of  Washington  DC 
Human  Factors  and  Ergonomics  Society,  Potomac  Chapter 
Institute  of  Electrical  and  Electronics  Engineers,  Northern  Virginia  Section 



Paul  Arveson 
Frank  R.  Haig,  S.  J. 
Sethanne  Howard 
Vacant 
Lee  Benaka 
David  W.  Brandt 
Vacant 

Charles  Martin 
Vacant 

Stuart  Umpleby 

Vacant 

Vacant 

Daniel  J.  Vavrick 
Vacant 

Mark  Holland 
Vacant 

Toni  Marechaux 
Jodi  Wesemann 
Alan  Ford 
F.  Douglas  Witherspoon 
Chuck  Lowe 
Stephen  Gardiner 
Chris  Puttock 
Keith  Lempel 
Elise  Ann  Brown 
Vacant 
Tony  Jimenez 
Ronald  W.  Mandersheid 
Vacant 
Vacant 

Jeffrey  B.  Plescia 
Jurate  Landwehr 
Vacant 

Gerald  P.  Krueger 
Murty  Polavarapu 
Institute  of  Electrical  and  Electronics  Engineers,  Washington  Section 

Institute  of  Food  Technologists,  Washington  DC  Section 
Institute  of  Industrial  Engineers,  National  Capital  Chapter 
International  Association  for  Dental  Research,  American  Section 

International  Society  for  the  Systems  Sciences 
International  Society  of  Automation,  Baltimore  Washington  Section 

Instrument  Society  of  America 
Marine  Technology  Society 
Maryland  Native  Plant  Society 

Mathematical  Association  of  America,  Maryland-District  of  Columbia-Virginia  Section 
Medical  Society  of  the  District  of  Columbia 
National  Capital  Area  Skeptics 
National  Capital  Astronomers 
National  Geographic  Society 

Optical  Society  of  America,  National  Capital  Section 
Pest  Science  Society  of  America 
Philosophical  Society  of  Washington 
Society  for  Experimental  Biology  and  Medicine 
Society  of  American  Foresters,  National  Capital  Society 
Society  of  American  Military  Engineers,  Washington  DC  Post 
Society  of  Manufacturing  Engineers,  Washington  DC  Chapter 
Society  of  Mining,  Metallurgy,  and  Exploration,  Inc.,  Washington  DC  Section 
Soil  and  Water  Conservation  Society,  National  Capital  Chapter 

Technology  Transfer  Society,  Washington  Area  Chapter 
Virginia  Native  Plant  Society,  Potowmack  Chapter 
Washington  DC  Chapter  of  the  Institute  for  Operations  Research  and  the  Management  Sciences  (WINFORMS) 
Washington  Evolutionary  Systems  Society 
Washington  History  of  Science  Club 
Washington  Paint  Technology  Group 
Washington  Society  of  Engineers 
Washington  Society  for  the  History  of  Medicine 
Washington  Statistical  Society 

World  Future  Society,  National  Capital  Region  Chapter 